DOCUMENT RESUME

ED 208 008                                              TM 810 689

AUTHOR          Chromy, James R.; And Others
TITLE           Year 11 Primary Sample for the National Assessment of
                Educational Progress. Final Report.
INSTITUTION     Research Triangle Inst., Research Triangle Park, N.C.
                Center for Sampling Research and Design.
SPONS AGENCY    Education Commission of the States, Denver, Colo.
                National Assessment of Educational Progress.;
                National Center for Education Statistics (ED),
                Washington, D.C.; National Inst. of Education (ED),
                Washington, D.C.
REPORT NO       RTI-1764-00-00F
PUB DATE        Aug 81
CONTRACT        OEC-0-74-0506
GRANT           NIE-G-80-0003
NOTE            73p.

EDRS PRICE      MF01/PC03 Plus Postage.
DESCRIPTORS     Asian Americans; Computer Programs; *Educational
                Assessment; Elementary Secondary Education; Hispanic
                Americans; *National Competency Tests; Research
                Design; *Sampling; Testing Problems; Testing
                Programs
IDENTIFIERS     *National Assessment of Educational Progress



ABSTRACT

The primary sample for Year 11 of the National Assessment of
Educational Progress (NAEP) was selected in March 1979, and was
preceded by an 18-month planning effort. During the planning period,
research concentrated in five specific areas: sampling frame
construction, stratification criteria, efficiency study review,
techniques and computer software for highly stratified sample
selection, and sampling for Asian and Hispanic populations. Primary
samples from the first ten years are reviewed, and the sampling frame
construction is discussed. The actual selection of samples, the
sample stratification, options for large and small annual samples,
selection techniques, and sampling for special populations are
discussed. Primary type of information provided by report: Procedures
(Sampling). (Author/BW)



***********************************************************************
*    Reproductions supplied by EDRS are the best that can be made     *
*                    from the original document.                      *
***********************************************************************



RTI/1764/00-00F

Final Report 



U.S. DEPARTMENT OF EDUCATION
NATIONAL INSTITUTE OF EDUCATION
EDUCATIONAL RESOURCES INFORMATION CENTER (ERIC)

This document has been reproduced as received from the person or
organization originating it.

Minor changes have been made to improve reproduction quality.

Points of view or opinions stated in this document do not necessarily
represent official NIE position or policy.



YEAR 11 PRIMARY SAMPLE FOR THE 
NATIONAL ASSESSMENT OF EDUCATIONAL PROGRESS 



by 



James R. Chromy
Anne F. Clemmer
Bruce L. Jones



Sampling Research and Design Center
Research Triangle Institute
Research Triangle Park, North Carolina 27709



Prepared for 
National Assessment of Educational Progress 



August 1981 



RESEARCH TRIANGLE PARK, NORTH CAROLINA 27709



TABLE OF CONTENTS

1. INTRODUCTION
   1.1 Planning Period Activities
   1.2 Sample Overview
   1.3 Report Organization

2. TEN YEARS' PRIMARY SAMPLES
   2.1 Common Elements
   2.2 Differentiating Elements
   2.3 Summary Characteristics

3. SAMPLING FRAME CONSTRUCTION
   3.1 Sampling Frame Units
   3.2 Sampling Frame Variables and Data Sources
   3.3 Editing and Verification Procedures

4. STRATIFICATION
   4.1 Overview
   4.2 Sample Allocation by Region and SDOC Category
   4.3 Meeting the All-State Requirements Over a Four-Year Period
   4.4 Selecting the Sample

5. OPTIONS FOR LARGE AND SMALL ANNUAL SAMPLES
   5.1 Primary Sample
   5.2 Secondary Sample

6. SELECTION TECHNIQUES

7. SPECIAL POPULATIONS

REFERENCES






LIST OF TABLES

Table
2-1   National Assessment Reporting Categories
2-2   Definitions of National Assessment Regional Subpopulations
2-3   Summary Characteristics From First Ten NAEP Samples
4-1   Sample Allocation by Region and SDOC Categories
4-2   Allocation in Terms of 1-, 2-, and 3-Replicate Units
4-3   Five-Sample Allocation by Region, State, and SDOC
4-4   Serpentine Ordering of States Within Region
4-5   Illustration of Serpentine Ordering of Sampling Frame
4-6   Assignment of Selected Units to Years
5-1   Partitioning Supplemental Samples into [illegible] Subsamples
5-2   Numbers of Schools per Replicate and per PSU for Each Option
7-1   Weighted Estimates of Minority Population From 4-Year Primary
      Sample
7-2   Weighted Estimates of Minority Populations by Region and SDOC
      for Year 11
7-3   Weighted Estimates of Minority Populations by Region and SDOC
      for Year 12
7-4   Weighted Estimates of Minority Populations by Region and SDOC
      for Year 13
7-5   Weighted Estimates of Minority Populations by Region and SDOC
      for Year 14
7-6   Weighted Estimates of Minority Populations by Region and SDOC
      for Supplemental Year 14
7-7   Weighted Estimates of Minority Populations by Region and SDOC
      for Frame






1. INTRODUCTION

This report is submitted to the National Assessment of Educational
Progress (NAEP) and constitutes the final report for the coordinated
four-year primary sample commencing in Year 11. The sample was selected
in March 1979 and was preceded by an 18-month planning effort. During
the planning period, primary designs from the first ten years were
examined in terms of strengths and weaknesses, design efficiency studies
conducted in Year 07 were re-examined, and the direction of the sample
over the next four years was discussed.

1.1 Planning Period Activities

During the planning period, research concentrated in five specific
areas, each of which is discussed below.

1.1.1 Sampling Frame Construction

A minimum set of variables to be included on the sampling frame was
developed and additions were made as appropriate. All sampling frame
information was organized at the 1970 Census-defined county level. The
final set of variables is discussed in Chapter 3.

1.1.2 Stratification Criteria

Stratification criteria used in previous assessments were reviewed.
Potential stratification variables related to region, race or ethnicity,
community characteristics, and occupation were included on the sampling
frame. The existence of the sampling frame and various stratification
variables permitted the testing of different stratification and sample
selection strategies.



1.1.3 Efficiency Study Review

Variance component estimates from the Year 07 design efficiency studies
were re-examined. The sample design planned for Year 11 was found to be
generally consistent with the findings of the efficiency study and the
special requirements of NAEP for domain estimation.

1.1.4 Techniques and Computer Software for Highly Stratified Sample
      Selection

The final product of this research was the computer software required
to order listing units in a serpentine fashion and form equal sized
zones from which one unit was selected. The stratification and zone
formation techniques are detailed in sections 4.4.2 and 4.4.3,
respectively. The sample selection process is discussed in Chapter 6.
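
The serpentine ordering and zone selection described above can be pictured with a minimal sketch. It is not the software produced under this project: the two sort keys, the field names, and the rule of drawing one random point per zone on the cumulated size scale are all assumptions made for illustration.

```python
import random
from itertools import groupby


def serpentine_order(units, outer_key, inner_key):
    """Sort by outer_key, reversing the inner_key direction in every other group.

    Units on either side of a group boundary then tend to be similar on the
    inner key, which is the point of a serpentine (back-and-forth) ordering.
    """
    ordered = []
    by_outer = sorted(units, key=outer_key)
    for i, (_, group) in enumerate(groupby(by_outer, key=outer_key)):
        ordered.extend(sorted(group, key=inner_key, reverse=(i % 2 == 1)))
    return ordered


def select_one_per_zone(ordered, size_key, n_zones, rng):
    """Cut the cumulated size scale into n_zones equal zones and draw one
    random point per zone; the unit whose size interval covers a point is
    selected (a very large unit can therefore be hit more than once)."""
    total = sum(size_key(u) for u in ordered)
    zone_width = total / n_zones
    points = [(z + rng.random()) * zone_width for z in range(n_zones)]
    selected, cumulated, idx = [], 0.0, 0
    for unit in ordered:
        cumulated += size_key(unit)
        while idx < n_zones and points[idx] < cumulated:
            selected.append(unit)
            idx += 1
    # Guard against floating-point rounding at the very top of the scale.
    selected.extend([ordered[-1]] * (n_zones - idx))
    return selected


# Hypothetical frame: region, percent rural, and a size measure per unit.
frame = [
    {"region": "A", "rural": 50, "size": 5},
    {"region": "B", "rural": 20, "size": 5},
    {"region": "A", "rural": 10, "size": 5},
    {"region": "B", "rural": 80, "size": 5},
]
ordered = serpentine_order(frame, lambda u: u["region"], lambda u: u["rural"])
sample = select_one_per_zone(ordered, lambda u: u["size"], 2, random.Random(0))
```

In this toy frame, region A is listed from least to most rural and region B from most to least rural, so the ordering turns around at the regional boundary instead of jumping back to the least rural unit.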


1.1.5 Sampling for Asian and Hispanic Populations

Appropriate 1970 Census data useful for identifying Hispanic or Asian
populations in primary sampling units composed of counties were included
in the sampling frame data set.

An alphabetic list of Spanish surnames was obtained from the Bureau of
the Census. The list could be used to identify and oversample Spanish
students in schools. No comparable list existed for Asian names. Spanish
surname identification procedures were pretested at six school locations
during quality check visits. Generally favorable results were reported.

Specific sampling procedures adopted for special populations are
discussed in Chapter 7.

1.2 Sample Overview

The National Assessment sampling design is a three-stage stratified
probability sample. Stratification variables include region, community
size, and socioeconomic status. The selection of the primary sample is
only the first step in the process. An overview of the general sampling
and weighting process is included here for completeness and reference.
The National Assessment sample is designed to be representative of
students in three age classes, 9-, 13-, and 17-year-olds, in all schools
and communities in the nation. It is also designed to produce, for a
variety of subpopulations, performance estimates which are relatively
unbiased and which meet certain precision requirements.

Primary Sampling Units (PSUs) are geographic land areas consisting of
a single county or several counties. Each year approximately 83 PSUs are
randomly selected on a probability basis so that every county and every
state in the United States has a positive chance of being included in the
sample.

At the second stage of sampling, a list of all schools, both public
and private, within each of the selected PSUs is developed and a
probability sample of these schools is selected for each of the three
age classes. The number of schools selected in each PSU is determined by
the approximate number of students in the eligible age group attending
each school. Schools are selected in such a way that any given school
will not appear in the sample more than once in a four-year period. In
most years, about 1,600 schools are selected; the number selected in a
particular year depends upon the number of distinct packages.

The third and final stage of sampling is the selection of a random
sample of students from the eligible age group at each selected school.
A total of approximately 2,600 respondents is obtained for each National
Assessment package. Generally, the students are selected from one to
eight schools within each selected PSU for each of the three age groups
being assessed.

Selected students who do not show up for assessment are termed
nonrespondents. Response rates for 9- and 13-year-olds tend to average
about 85 percent, whereas the response rate for 17-year-olds averages
75 percent. Seventeen-year-olds who miss their appointments are followed
up in school the day after the assessment. Seventeen-year-old dropouts
and early graduates are located in their homes and administered
packages. According to census data, about 10 percent of the 17-year-olds
are not enrolled in school. Including these out-of-school individuals in
the target population enables National Assessment to apply its results
to the entire population of 17-year-olds rather than only to those
enrolled in school. The assessment of dropouts and early graduates is
termed the Supplementary Frame assessment.

Sample weights adjusted for nonresponse are computed for each age
class. The weights are calculated as the reciprocal of the appropriate
selection probabilities. Sample weights are used to calculate ratio
estimates of the proportions of population members who respond in
alternative ways to assessment exercises. So that the proportion of
population members who respond in alternative ways can be calculated
based on community location and occupation of parents, the assessment
data are postclassified into seven size and type of community (STOC)
categories.
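
The weighting steps just described can be illustrated with a short sketch. The report states only that weights are reciprocals of selection probabilities, adjusted for nonresponse, and used in ratio estimates; the particular nonresponse adjustment shown here (inflating by the inverse response rate within a class) and all function names are assumptions.

```python
def base_weight(selection_probs):
    """Reciprocal of the product of the stage-wise selection probabilities."""
    prob = 1.0
    for p in selection_probs:
        prob *= p
    return 1.0 / prob


def nonresponse_adjusted(weight, n_selected, n_responded):
    """Inflate a weight by the inverse response rate (assumed adjustment)."""
    return weight * n_selected / n_responded


def ratio_estimate(records):
    """Weighted proportion: weights of those giving a response, divided by
    the weights of all respondents. records holds (weight, gave_response)."""
    total = sum(w for w, _ in records)
    hits = sum(w for w, gave_response in records if gave_response)
    return hits / total


# Hypothetical student: PSU probability 0.5, school-and-student 0.25.
w = base_weight([0.5, 0.25])               # 8.0
w_adj = nonresponse_adjusted(w, 20, 16)    # 10.0 when 16 of 20 respond
p_hat = ratio_estimate([(10.0, True), (10.0, False), (20.0, True)])  # 0.75
```

Because the estimate is a ratio of two weighted sums, any constant factor applied to every weight in a domain cancels out.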

1.3 Report Organization

The primary sample planning period activities are reviewed in the
initial chapters. Primary samples from the first ten years are reviewed
in Chapter 2, and the sampling frame construction is discussed in
Chapter 3. The actual selection of the sample is discussed next; the
sample stratification, options for large and small annual samples, and
selection techniques are detailed in Chapters 4, 5, and 6, respectively.
Sampling for special populations is discussed in Chapter 7.



2. TEN YEARS' PRIMARY SAMPLES

In the sections which follow, the primary samples from the first ten
years of the National Assessment of Educational Progress (NAEP) are
compared. Similarities are cited in section 2.1 while differences are
noted in section 2.2. A summary of the characteristics from each year's
sample is provided in section 2.3.

2.1 Common Elements

National Assessment reports results for a variety of subpopulations.
Besides the three in-school age groups, reported subpopulations include
within each age level four geographic regions, sex, race, grade, four
levels of parents' education, and seven size and type of community
(STOC) categories. These reporting groups are listed in table 2-1.

A major objective of the National Assessment survey design is to
guarantee adequate sample representation for the reporting
subpopulations listed in table 2-1. Such representation is essential if
reasonably precise comparisons among these subpopulations are to be made
within a given assessment year and with previous years when the same
subject areas were assessed. For these reasons the primary samples for
the first ten years have always included stratification by region and
community and oversampling of low socioeconomic subpopulations. These
three topics are discussed in the sections which follow.

2.1.1 Stratification by Region

The geographic regions referred to in table 2-1 are those used by the
Office of Business Economics, Department of Commerce. Table 2-2 defines
NAEP's regions in terms of the sets of states which comprise the four
geographic areas. Consistently in Years 01 through 10, this same set of
regional strata has been used.



Table 2-1. National Assessment reporting categories

Classification       Number of        Subgroup names
                     subgroups
-------------------  ---------------  -------------------------------------
Age level            3                9-, 13-, 17-year-olds
Sex                  2                Male, Female
Race                 4                White, Black, Hispanic, Other
Geographic region    4                Northeast, Southeast, Central, West
Level of parental    4                No high school
education                             Some high school
                                      Graduate high school
                                      Post high school
Size and type of     7                Low metropolitan (extreme inner city)
community (STOC)                      High metropolitan (extreme affluent
                                        suburb)
                                      Extreme rural
                                      Main big city (remainder of big city)
                                      Urban fringe (suburban fringe)
                                      Medium city
                                      Small places (small city)
Grade                3 (9's, 13's)    3, 4, Other
                     4 (17's)         7, 8, Other
                                      10, 11, 12, Other




Table 2-2. Definitions of National Assessment regional subpopulations

Northeast             Southeast       Central       West
--------------------  --------------  ------------  ----------
Delaware              Arkansas        Iowa          Alaska
Connecticut           Florida         Kansas        Hawaii
Maine                 Virginia        Nebraska      Idaho
New Hampshire         West Virginia   North Dakota  Montana
Rhode Island          Alabama         South Dakota  Nevada
Vermont               Georgia         Minnesota     Wyoming
District of Columbia  Kentucky        Missouri      Arizona
Maryland              Louisiana       Illinois      Oregon
Massachusetts         Mississippi     Indiana       Utah
New Jersey            North Carolina  Michigan      Colorado
Pennsylvania          South Carolina  Wisconsin     New Mexico
New York              Tennessee       Ohio          Oklahoma
                                                    California
                                                    Texas
                                                    Washington



2.1.2 Community Stratification

In order to insure proper sample representation in the seven STOC
categories, community stratification must occur at the primary sample
selection level. The form of community stratification has varied from
year to year. In Year 01, areas within a county were classified. In all
successive years, classification has been at the county level. There
were four types of community classifications in Year 01. They included:

    • Large central cities;
    • Fringe areas of the large central cities;
    • Middle sized cities; and
    • Rural and small town areas.

In Years 02 and 03, four precise county-level size of community (SOC)
definitions were developed in terms of 1960 Census data:

    SOC1 - all counties containing a central city with a population of
           180,000 or more,

    SOC2 - all counties in the same Standard Metropolitan Statistical
           Area (SMSA) as an SOC1 county,

    SOC3 - all counties not included in SOC1 or SOC2 that are either
           part of an SMSA or that contain at least one city with a
           population of 25,000 or more,

    SOC4 - all counties not included in SOC1, 2, or 3.
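
The four Years 02 and 03 definitions can be read as an ordered set of tests applied to each county. The sketch below is one such reading; the field names (`smsa`, `largest_central_city_pop`, `largest_city_pop`) are hypothetical and do not come from the sampling frame described in this report.

```python
def smsas_with_soc1(counties):
    """SMSAs that contain at least one county meeting the SOC1 rule."""
    return {
        c["smsa"]
        for c in counties
        if c["smsa"] is not None and c["largest_central_city_pop"] >= 180_000
    }


def soc_category(county, soc1_smsas):
    """Return 1-4 by applying the Years 02-03 rules in order."""
    if county["largest_central_city_pop"] >= 180_000:
        return 1                      # contains a large central city
    if county["smsa"] is not None and county["smsa"] in soc1_smsas:
        return 2                      # same SMSA as an SOC1 county
    if county["smsa"] is not None or county["largest_city_pop"] >= 25_000:
        return 3                      # other SMSA county, or a city of 25,000+
    return 4                          # everything else


# Hypothetical counties for illustration only.
counties = [
    {"smsa": "X", "largest_central_city_pop": 200_000, "largest_city_pop": 200_000},
    {"smsa": "X", "largest_central_city_pop": 0, "largest_city_pop": 10_000},
    {"smsa": None, "largest_central_city_pop": 0, "largest_city_pop": 30_000},
    {"smsa": None, "largest_central_city_pop": 0, "largest_city_pop": 5_000},
]
soc1 = smsas_with_soc1(counties)
categories = [soc_category(c, soc1) for c in counties]
```

Applying the tests in this order makes the four categories mutually exclusive, which matches the "not included in" wording of the SOC3 and SOC4 definitions.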

In Years 04 through 10, SOC was defined in terms of 1970 Census data.
The Year 04 definitions were similar to Years 02 and 03 except (1) the
size of the central city required to define SOC1 was increased from
180,000 to 350,000 and (2) SOC2 also included all counties with a
central city of 150,000 to 350,000 population.

... 

In Year 05, to facilitate stratification of the school sample along
size and type of community lines, SOC was defined to include entire 1970
SMSAs. SOC1, 2, and 3 consisted of entire SMSAs and SOC4 and 5 were
non-SMSA counties:

    SOC1 - the largest 15 SMSAs based on adjusted 14-year-old
           population (self-representers);

    SOC2 - the remaining 55 SMSAs with total population in excess of
           500,000;

    SOC3 - the remaining 162 SMSAs;

    SOC4 - non-SMSA counties with 60 percent or less of their
           14-year-old population classified as rural in the 1970
           Census;

    SOC5 - non-SMSA counties with more than 60 percent of their
           14-year-old population classified as rural in the 1970
           Census.

SOC4 and 5 were defined to include about equal numbers of 14-year-olds
in 1970. Fourteen-year-olds in 1970 would be aged 18 in 1974 when the
Year 05 assessment was conducted. The closest to 17-year-old single age
reported by urban and rural classification on the 1970 Census data tapes
was 14-year-olds. The Year 06 definitions were very similar to Year 05
except (1) Denver and Phoenix were removed from SOC2 and added to SOC1
as self-representers and (2) SOC4 and 5 were defined in terms of
non-SMSA primary units rather than counties. SOC4 consisted of those
primary units with less than 65 percent of their 14-year-old population
classified as rural in the 1970 Census. SOC5 contained those units with
65 percent or more of their 14-year-olds classified as rural.



The Year 06 definitions continued to be used in Years 07 through 10.

2.1.3 Oversampling of Low Socioeconomic Subpopulations

NAEP reports results for 7 STOC categories (see table 2-1). In order
to accurately report results for the first 2 categories, low
socioeconomic subpopulations in the large cities and rural areas must be
isolated and oversampled. The methods for locating and oversampling
these populations in the primary sample have varied over the years (see
section 2.2.2).




2.2 Differentiating Elements

The primary samples for the first ten years of NAEP have differed as
to annual allocation by state, locating and oversampling low
socioeconomic status (SES) subpopulations, and allocation of second
stage or school units. Each of these differences is discussed in the
sections which follow.

2.2.1 Control of State Sample Allocation

In Year 01, no control was exercised over sample allocations to states.
As a result the primary sample included selected units in 38 of the 50
states. Beginning in Year 02, each state had to be represented in the
primary sample annually. This requirement extended through Year 06. In
Year 07, a coordinated four-year primary sample was selected extending
through Year 10. The four-year sample required that each state be
represented at least once over the four-year period.

The all-state requirement was met in Years 02 through 04 by using a
controlled selection procedure developed by Jessen. For each region, a
table was prepared containing estimated adjusted 17-year-olds
(oversampled 17's counted twice) by state and major primary stratum.
Major primary strata consisted of the 8 categories obtained by crossing
the 4 size of community strata with the low and high (2) SES strata. The
total sample allocation of 216 replicates was allocated to regions in
proportion to adjusted 17-year-olds in the region. The sample allocation
to the region was then apportioned among the state by major stratum
cells in proportion to the adjusted 17-year-olds in each cell. States
whose allocation by this procedure was .5 or more were designated as
two-replicate PSU states. Remaining states were called one-replicate PSU
states. In a single-replicate state PSU, each package for each age class
was administered once with approximately 12 respondents per session. A
total of 216 replicates (208 in Year 01) were assigned, yielding the
planned sample size per package of 2,592 (12 x 216).
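
The proportional allocation step can be sketched as follows. The regional sizes are invented for illustration and the function name is an assumption; the expected allocations are deliberately left fractional, since the report leaves the rounding to the controlled selection procedure.

```python
def expected_allocation(adjusted_sizes, total_replicates=216):
    """Split total_replicates across cells in proportion to adjusted size.

    adjusted_sizes maps a cell label (region, or state-by-stratum cell) to
    its adjusted count of 17-year-olds. The results are expected values and
    are generally not whole numbers.
    """
    total = sum(adjusted_sizes.values())
    return {cell: total_replicates * size / total
            for cell, size in adjusted_sizes.items()}


# Hypothetical adjusted 17-year-old counts (thousands) by region.
regions = {"Northeast": 1000, "Southeast": 900, "Central": 1100, "West": 1000}
by_region = expected_allocation(regions)

# The same function apportions a region's share among its cells.
northeast_cells = {"state 1, stratum 1": 600, "state 2, stratum 1": 400}
by_cell = expected_allocation(northeast_cells, by_region["Northeast"])
```

Applying the function twice mirrors the two steps in the text: first regions, then state by major stratum cells within a region.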

Having computed the expected allocations described above, the next
step was the controlled selection of a sample pattern and the selection
of the sample primary units given the selection pattern. Controlled
selection insured that the actual sample allocation to any cell was
within one of the expected allocation. A set of allocations or patterns
was developed and probabilities were assigned to these patterns to meet
two requirements: (1) each pattern must satisfy certain row and column
total constraints exactly (i.e., the allocation to each state must be at
least 1); and (2) in repeated sampling of the patterns, the overall
probability of including any particular cell was fixed. A separate set
of patterns was developed for each region and one was selected using the
probabilities assigned to the patterns. Having determined the selection
pattern, the prescribed number of units was selected from each cell in
proportion to the adjusted numbers of 17-year-olds.

The variances associated with estimates derived from controlled
selection samples are very complicated. Furthermore, the variance
estimates are biased in the sense that they are overestimates of the
variance.

In Year 05, the method of controlled selection was abandoned in favor
of a deeply stratified design which met the all-state requirement and at
the same time provided simple, relatively unbiased estimates of
variance. To insure adequate regional representation, sample PSUs were
allocated to NAEP's four regions in proportion to the adjusted size
measure described above. The 15 largest SMSAs had adjusted size measures
big enough to warrant their inclusion in the sample with certainty. Two
additional SMSA PSUs, namely Denver and Phoenix, became
self-representing by virtue of the stratification scheme used to meet
the all-state requirement. Non-self-representing PSUs were selected with
probabilities strictly proportional to their adjusted 17-year-old
population. RTI's approach for meeting the all-state requirement was to
delineate primary stage substrata within states which were not already
represented by one of the 17 self-representing SMSAs. States were first
designated as one-replicate or two-replicate PSU states; a two-replicate
PSU state had at least 50,000 total population in each PSU and the PSU
was assigned two full sets of NAEP packages for each age class.
One-replicate states had a 25,000 population minimum for each PSU and
the sample PSU was assigned a single set of NAEP packages. States whose
adjusted size measure warranted a proportional allocation of two or more
double-replicate PSUs out of a total national allocation of 216
replicates had their non-SMSA counties aggregated to meet the 50,000
size minimum. The non-self-representing SMSA PSUs (SOC2 and SOC3) were
then ranked from largest to smallest in terms of adjusted size. The
non-SMSA PSUs (SOC4 and SOC5) were ranked from least rural to most rural
based on the percentage of rural 14-year-olds. Starting with the largest
SMSA units, adjusted size measures were accumulated down the ranking
until enough size was aggregated to warrant a proportional allocation of
a pair of two-replicate PSUs. Two PSUs were then selected from this
state substratum. Any remaining units not included in the largest and
least rural aggregate were placed in a regional pool with similar units
from other states.
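
The accumulation down the ranking can be sketched as below. The report does not give the exact stopping rule, so this sketch assumes that a substratum "warrants" a pair of two-replicate PSUs once its cumulated adjusted size reaches four replicates' worth, where one replicate's worth is the national adjusted size divided by 216. The function and parameter names are hypothetical.

```python
def carve_substratum(ranked_units, size_per_replicate, replicates_needed=4):
    """Accumulate units down the ranking until the target size is reached.

    ranked_units is a list of (name, adjusted_size) pairs, SMSA units first
    (largest to smallest), then non-SMSA units (least to most rural).
    Returns (substratum, remainder). The substratum is empty when the whole
    list is too small; the remainder would go to the regional pool.
    """
    target = size_per_replicate * replicates_needed
    cumulated = 0.0
    for i, (_, size) in enumerate(ranked_units):
        cumulated += size
        if cumulated >= target:
            return ranked_units[: i + 1], ranked_units[i + 1:]
    return [], ranked_units


# Hypothetical state with four ranked PSUs; one replicate is worth 10 units.
ranked = [("a", 30), ("b", 20), ("c", 10), ("d", 5)]
substratum, pool = carve_substratum(ranked, size_per_replicate=10)
```

With `replicates_needed=2`, the same function would describe a pair of single-replicate units in a one-replicate state.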

* » - - 

States whose aggregate adjusted size did not warrant a pair of
two-replicate units were classified as one-replicate states. Their
non-SMSA units were combined to meet a 25,000 minimum population
requirement and then ranked from least to most rural behind the SMSA
units. Units were again combined down the list until the aggregated size
deserved a proportional allocation of a pair of single-replicate units.
Two of these one-replicate PSUs were then selected with probabilities
strictly proportional to adjusted size and without replacement.

In order to exercise some control over the sample distributions of
PSUs by size of community, those PSUs in the primary frame which
belonged to states already covered by self-representing PSUs and those
remaining after appropriate sized substrata were carved from the
non-self-representing states were placed in a regional pool. Units in
the regional pool were first stratified into one- and two-replicate PSU
substrata and then ranked by size and percent rural. Additional strata
were formed along the size-rural ordering so that each stratum deserved
a proportional allocation of two or three units per stratum.

The Year 05 procedure was repeated in Year 06. As noted earlier, a
four-year sample was selected in Year 07 for Years 07 through 10. The
Year 05 procedure was applied to the four-year sample. It was also
decided to reduce the number of primary sampling units or travel points
in the four-year sample. This change was motivated by the reduced
funding level anticipated for Years 07 through 10. The NAEP sample for
Years 02 through 06 contained roughly 115 distinct travel points with
each group package scheduled for 216 group sessions of 12 students. To
maintain the same sample size for group packages with a drop to roughly
70 travel points per year and 162 group sessions per package, the
planned group session size was increased to 16. Since each group session
for a particular package is conducted in a separate school, one notes
that the design change introduced in Year 07 also implies a reduction in
schools assessed per package from 216 to 162. Thus, the four-year sample
consisted of 648 replicates (162 x 4). These replicates were allocated
to regions and to states and regional pools within regions exactly as
described in the preceding paragraphs. However, the allocation and
selection consisted of 8 (4 x 2 single- or double-replicate units) PSUs
instead of 2. The 8 selected PSUs were then randomly assigned in pairs
to each of the four years.
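
The figures in this paragraph can be checked with a few lines of arithmetic. The variable names are chosen for this check only; the numbers are those given in the text.

```python
# Years 02 through 06: 216 group sessions of 12 students per package.
sessions_before, session_size_before = 216, 12
# Years 07 through 10: 162 group sessions of 16 students per package.
sessions_after, session_size_after = 162, 16

per_package_before = sessions_before * session_size_before   # 2592
per_package_after = sessions_after * session_size_after      # 2592

# One replicate per session per year, over a four-year sample.
four_year_replicates = sessions_after * 4                    # 648

# Each substratum pair of PSUs is drawn once for each of the four years.
psus_per_substratum = 4 * 2                                  # 8
```

The planned sample size per package is therefore unchanged at 2,592 even though the number of schools per package drops by one quarter.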

2.2.2 Definition of Low Socioeconomic Status

In Years 01 through 04, low socioeconomic subpopulations were defined
in terms of percent of population earning less than $3,000, which at the
time was the national poverty level. Approximately 20 percent of the
lowest SES-ranked primary units in each region and SOC3 and 4 category
were isolated and oversampled at a rate of about 2 to 1 by doubling the
selection size measure (i.e., estimated 17-year-olds) in each unit prior
to selection. For SOC1 and 2 primary units, low SES schools were
oversampled within each unit.

In Years 05 through 10, low SES in urban areas (low metropolitan) was
oversampled in a different fashion from rural low SES areas (extreme
rural). The use of the percent of population earning less than $3,000
annually to identify low SES subpopulations was abandoned. The two new
methods are explained below.

2.2.2.1 Oversampling low metropolitan subpopulations. Low income inner
city areas within the largest 65 SMSAs with total populations in excess
of 500,000 were isolated. Census Employment Survey (CES) low income
inner city Census tracts were used to define low metropolitan areas in
the 40 SMSAs where such tracts had been identified. For the 25 cities
among the largest 65 SMSAs where CES areas were not defined, compact
groups of inner city Census tracts with low income characteristics
similar to the CES areas were defined. Oversampling was accomplished by
doubling the estimated numbers of 17-year-olds in these areas prior to
primary selection. The low metropolitan areas contained 7.3 percent of
the 1970 14-year-olds.

2.2.2.2 Oversampling extreme rural subpopulations. The extreme rural
subpopulation was associated with the rural portions of counties whose
1970 14-year-old population was at least 75 percent rural. The estimated
numbers of 17-year-olds were doubled prior to primary selection in
counties which were 75 percent or more rural. These extreme rural areas
account for 10 percent of the 1970 14-year-old population.
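
The effect of doubling the size measure can be seen in a small sketch of selection with probability proportional to size. The units, the counts, and the function name are hypothetical; the sketch assumes no unit is large enough to be selected with certainty.

```python
def pps_probabilities(units, n_select):
    """Selection probabilities proportional to an adjusted size measure.

    units is a list of (estimated_17_year_olds, oversample) pairs. The size
    measure is doubled for units flagged for oversampling, as in the low
    metropolitan and extreme rural areas described above.
    """
    sizes = [2 * n if oversample else n for n, oversample in units]
    total = sum(sizes)
    return [n_select * s / total for s in sizes]


# Two units of equal size, one flagged for oversampling, plus a larger unit.
units = [(100, True), (100, False), (200, False)]
probs = pps_probabilities(units, n_select=1)
oversampling_rate = probs[0] / probs[1]
```

For two units with the same estimated number of 17-year-olds, the flagged unit is twice as likely to be selected, which is the 2-to-1 oversampling rate cited in the text.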

2.2.3 Allocation of Second Stage Units

The manner in which second stage units or schools were allocated to
selected primary sampling units (PSUs) has varied over the years. The
differences can be categorized into two types: definition of second
stage units and oversampling low SES in second stage units. Each of
these topics is discussed in the following sections.

2.2.3.1 Definition of second stage units. In Years 01 through 04,
schools were defined as the second-stage sampling units (SSUs) in
smaller primary units. In most of the large PSUs, local area schools
were clustered as SSUs to reduce the number of school districts in the
sample and to reduce travel costs between sample schools in the same
PSU. In forming these clusters, particular emphasis was placed on
representing some of both high and low SES populations. In other words,
the clusters were as heterogeneous as possible with respect to SES.
Additionally, in Year 02 to reduce field costs, a procedure developed by
Keyfitz was modified to maximize the probability of selecting the same
SSU for more than one age class.






The Years 05 through 10 secondary samples were designed to allow for
simple, unbiased variance estimation. To achieve this purpose, the
school frame in large PSUs (i.e., self-representing units) was
stratified into two- and three-replicate zones containing populations of
similar types. For example, in a particular self-representing unit, one
two-replicate zone might consist of low metropolitan and remainder of
the city schools; the second zone containing only schools from outside
the city limits could account for another two replicates. To simplify
estimation of the within PSU variance contribution from
self-representing SMSAs, schools were selected in two or three
nonoverlapping one-replicate subsamples which would easily accommodate
the paired selection variance scheme.

2.2.3.2 Oversampling low SES in second stage units. In Years 01 through
04, schools within each selected primary unit were stratified into two
strata, high and low SES, based on an SES index. The SES index was
calculated from Internal Revenue Service tax return tabulations by zip
code areas. Each school in a particular five-digit zip code area was
assigned an SES index equal to the proportion of total tax returns with
less than $3,000 adjusted gross income. For SOC1 and 2 primary units,
the low SES stratum consisted of schools with one-third of the estimated
students and the highest¹ values of the SES index; the remaining
two-thirds of the estimated students comprised the high SES stratum. For
SOC3 and 4, the low SES stratum was schools with one-half of the
estimated students and the highest values of the SES index; the
remaining one-half of the estimated students constituted the high SES
stratum. SES stratification was effected separately for each of the
three age groups. Approximately one-half of the sample schools were
selected from each of the two SES strata within each primary unit.

¹ Note that a high value of the SES index for a zip code area indicates
a high proportion of low SES individuals, i.e., a "low SES" area.

In Years 05 through 10, low SES school strata were defined in terms of
1970 Census data. Specifically, low metropolitan schools were those
schools located in the Census Employment Survey (CES) low income areas.
Extreme rural schools were defined as schools located in non-SMSA counties
where the 14-year-old populations in 1970 were at least 75 percent rural.
The sample allocation to schools was made after estimated eligibles in low
metropolitan and extreme rural schools had been doubled. This procedure
had the effect of oversampling low SES schools at a rate of approximately
two-to-one in relation to nonoversampled schools.
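
The doubling step can be illustrated with a small sketch. It assumes, purely
for illustration, that the sample is allocated to schools in proportion to
their (possibly doubled) estimated eligibles; the function name, the tuple
layout, and the example counts are hypothetical and are not taken from the
report.

```python
def allocate_with_oversampling(schools, n_sample):
    """Allocate n_sample across schools in proportion to adjusted eligibles.

    schools: list of (name, estimated_eligibles, oversample) tuples, where
    oversample is True for low metropolitan or extreme rural schools.
    """
    # Double the estimated eligibles of oversampled schools before allocating.
    adjusted = {name: eligibles * (2 if oversample else 1)
                for name, eligibles, oversample in schools}
    total = sum(adjusted.values())
    return {name: n_sample * size / total for name, size in adjusted.items()}


# Two hypothetical schools with equal eligibles: the oversampled school
# receives twice the allocation of the other one.
shares = allocate_with_oversampling(
    [("low_metro", 300, True), ("other", 300, False)], n_sample=9)
```

Because the doubling is applied to each school's own eligibles, the extra
allocation in a primary unit grows with the size of its oversampled
population rather than being fixed per unit.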

The procedure used in Years 05 through 10 was felt to be superior to
that employed in earlier years because the oversampled population varied
from one primary unit to the next, and the later procedure took advantage
of this fact. The earlier procedure maintained a fixed oversampling rate
per primary unit regardless of the size of the oversampled population in
the PSU.

2.3 Summary Characteristics

Table 2-3 summarizes characteristics from the first ten NAEP samples
by year and age class. In Year 01, the sample consisted of 208 replicates.
The number of replicates was increased to 216 and controlled selection was
used as the method of primary sample selection in Years 02 through 04. A
major sample redesign occurred and controlled selection was abandoned as
the method of primary selection in Years 05 and 06. A coordinated
four-year primary sample was selected in Years 07 through 10, and the
number of replicates was reduced from 216 to 162. Important primary sample
characteristics from each of these time periods are discussed in the
sections which follow. Frequent reference is made to table 2-3.

Table 2-3. Summary characteristics from first ten NAEP samples

                                    Group packages         Individual packages
Age &                   Schools          Students b/              Students c/
Year  Reps  Schools a/  per rep   No.    Total  Per pkg    No.    Total  Per pkg

Age 9
  01   208         935     4.50     8   19,478    2,435      2    3,715    1,858
  02   216       1,007     4.66     9   22,366    2,485      3    6,433    2,144
  03   216         782     3.62     4   10,824    2,706      3    6,953    2,318
  04   216         971     4.50     7   18,639    2,663      3    6,769    2,256
  05   216       1,246     5.77    10   26,053    2,605      1    2,233    2,233
  06   216       1,003     4.64    12   28,932    2,411     NA       NA       NA
  07   162         412     2.54     4    9,860    2,465      1    2,306    2,306
  08   162         451     2.78     7   17,360    2,480     NA       NA       NA
  09   162         465     2.87     7   17,190    2,456     NA       NA       NA
  10   162         539     3.33    11   27,620    2,511     NA       NA       NA

Age 13
  01   208         749     3.60     9   21,725    2,414      3    5,582    1,861
  02   216       1,029     4.76    13   32,328    2,487      2    4,307    2,154
  03   216         913     4.23     7   18,669    2,667      3    6,870    2,290
  04   216         979     4.53     9   23,503    2,611      3    6,744    2,248
  05   216       1,278     5.92    14   36,080    2,577      1    2,239    2,239
  06   216         972     4.50    13   30,963    2,382     NA       NA       NA
  07   162         549     3.39    12   29,901    2,492     NA       NA       NA
  08   162         472     2.91    10   25,663    2,566     NA       NA       NA
  09   162         442     2.73    11   26,665    2,424     NA       NA       NA
  10   162         496     3.06   13½   37,412    2,771     NA       NA       NA

Age 17
  01   208         670     3.22    11   23,348    2,123      2    3,443    1,722
  02   216         631     2.92    10   23,348    2,335      2    4,274    2,137
  03   216         759     3.51     9   21,233    2,359      3    6,565    2,188
  04   216         798     3.69    11   25,908    2,355      3    6,507    2,169
  05   216       1,052     4.87    16   36,709    2,294      1    2,163    2,163
  06   216         830     3.84    19   41,286    2,173     NA       NA       NA
  07   162         439     2.71    13   29,049    2,235     NA       NA       NA
  08   162         428     2.64    14   37,174    2,655     NA       NA       NA
  09   162         453     2.80    14   31,576    2,255     NA       NA       NA
  10   162         435     2.69   14½   37,083    2,557     NA       NA       NA

a/ Counts only schools where assessment was conducted.
b/ Includes regular sessions and standby sessions.
c/ Excludes followup session counts and includes alternates.

2.3.1 Year 01

Sample allocation to states was not controlled in Year 01, and as a
result, 38 out of the 50 states were included in the sample. The sample
was also characterized by having 208 replicates and 4.50 9-year-old schools
assessed per replicate, while 3.60 and 3.22 13- and 17-year-old schools
were assessed per replicate (see table 2-3). The average group package
sample size was 2,435, 2,414, and 2,123 for 9-, 13-, and 17-year-olds
compared to a targeted value of 2,496 (12 x 208). The average individual
package sample size was 1,858, 1,861, and 1,722 for 9's, 13's, and 17's
compared to a targeted value of 2,080 (10 x 208).

2.3.2 Years 02 through 04

Controlled selection was the method by which the Year 02 through 04
primary samples were selected. Every state and the District of Columbia
were represented in the sample every year. The samples were composed of
216 replicates each year. The numbers of schools assessed per replicate
per age varied from 3.51 to 4.76. The targeted group package sample size
was 2,592 (12 x 216) and the actual average sample size ranged from 2,335
to 2,706. For individual packages, the targeted value was 2,160 (10 x 216)
and the actual values ranged from 2,136 to 2,318.

2.3.3 Years 05 and 06

Controlled selection was abandoned as the method of primary sample
selection in favor of a deeply stratified sample which fulfilled the
all-state requirement and at the same time provided simple, relatively
unbiased estimates of variance. Samples in each year were composed of 216
replicates. A maximum number of schools was assessed per replicate in
Year 05: 5.77 for 9-year-olds, 5.92 for 13-year-olds, and 4.87 for
17-year-olds. The average group package sample size ranged from 2,173 to
2,605. Individual packages were administered in Year 05 only and in every
case the targeted sample size was met.

2.3.4 Years 07 Through 10

The deeply stratified sample which fulfilled the all-state requirement
and provided simple, relatively unbiased estimates of variance was extended
to a four-year period. The number of students per group administration was
increased from 12 to 16 so that the number of replicates could be decreased
from 216 to 162. Decreasing the number of replicates accounted for a
sizeable reduction in field costs. Increasing the group size per
administration allowed the targeted sample sizes of 2,592 to be met (i.e.,
216 x 12 = 162 x 16 = 2,592). The total number of schools selected per
year was maintained at 1,600. In previous years, about twice this number
of schools was selected. Schools were kept to a minimum to reduce field
costs. The numbers of schools assessed per replicate ranged from 2.54 to
3.33, which was considerably below the level of earlier years. Average
group package sample sizes varied from 2,235 to 2,771. One individual
package was administered in Year 07 at age 9 and the targeted sample size
was met.
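
The targeted sample sizes quoted in sections 2.3.1 through 2.3.4 all follow
one rule: students per administration times number of replicates. The short
sketch below simply restates that arithmetic; the function and dictionary
names are illustrative and do not come from the report.

```python
def targeted_sample_size(students_per_administration, replicates):
    """Targeted students per package: one administration per replicate."""
    return students_per_administration * replicates


# (students per administration, replicates) as quoted in the text above.
GROUP_TARGETS = {
    "Year 01": targeted_sample_size(12, 208),
    "Years 02-04": targeted_sample_size(12, 216),
    "Years 07-10": targeted_sample_size(16, 162),
}
INDIVIDUAL_TARGETS = {
    "Year 01": targeted_sample_size(10, 208),
    "Years 02-04": targeted_sample_size(10, 216),
}
```

The equality of the Years 02-04 and Years 07-10 group targets is the point
made in section 2.3.4: a larger group size offsets the smaller number of
replicates.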





3. SAMPLING FRAME CONSTRUCTION 

3.1 Sampling Frame Units

The units used for constructing the basic sampling frame file were the
counties and county-equivalent independent cities recognized by the Census
Bureau in 1970. Washington, D.C. was included as a single frame unit,
though it is neither an independent city nor county as in other states.
Except for the Alaska portion of the frame, there was one sampling frame
unit for each 1970 county and county equivalent.

For the Alaska portion of the frame, the two largest Census Divisions
(county equivalent), Anchorage Census Division and Fairbanks Census
Division, each comprised a frame unit, since each had a sizeable city and
was reasonably compact and accessible. These two units alone contained 56
percent of the state's population in 1970. The third frame unit for Alaska
was comprised of the Juneau Census Division and 21 specific places. All
the included places had 1970 populations of 1,000 or more, or are in close
proximity to such a place, and are accessible via regular air
transportation. In total, the three Alaska frame units contain 75.7
percent of the state's 1970 population.

The sampling frame was comprised of a total of 3,115 basic units.

3.2 Sampling Frame Variables and Data Sources

A data record was compiled for each sampling frame unit consisting of
33 primary variables representing identification and descriptive data.
Additional size measure and stratification variables were computed or
derived from the primary data and added to the data records. Following is
a description of each frame variable, including the source of the data and
estimation procedure, if applicable. Variables are listed alphabetically
by SAS name.






AGE9 - Estimated 9-year-olds, 1977-78

The number of 9-year-olds enrolled in the county in 1977-78 was
estimated as follows:

   AGE9 = .0082 (2nd grade enrollment) + .2386 (3rd grade enrollment)
          + .7387 (4th grade enrollment) + .0051 (5th grade enrollment).

Grade enrollments were obtained by summarizing to the county level
Curriculum Information Center's 1977-78 school-level grade-by-grade data
for public, Catholic and other private schools. The proportions of
9-year-olds among the grades' enrollments, the coefficients in the
computation formula, were estimated using weighted National Assessment data
and October 1975 year-by-grade population estimates from Current Population
Reports.

AGE13 - Estimated 13-year-olds, 1977-78

The number of 13-year-olds enrolled in the county in 1977-78 was
estimated as follows:

   AGE13 = .0231 (6th grade enrollment) + .2314 (7th grade enrollment)
           + .6954 (8th grade enrollment) + .0036 (9th grade enrollment).

Grade enrollments and coefficients were determined as described for AGE9.

AGE17 - Estimated 17-year-olds, 1977-78

The number of 17-year-olds enrolled in the county in 1977-78 was
estimated as follows:

   AGE17 = .0148 (9th grade enrollment) + .1345 (10th grade enrollment)
           + .7896 (11th grade enrollment) + .1203 (12th grade enrollment).

Grade enrollments and coefficients were determined as described for AGE9.
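
The three estimates above, together with the SIZE variable defined later in
this list as their average, can be sketched as follows. The coefficients are
those printed in the formulas; the leading 6th grade coefficient for 13-year-
olds is read here as .0231 from a damaged scan and should be treated as an
assumption. The function names and the grade-keyed dictionary layout are
illustrative only.

```python
# Coefficients by grade, as printed in the AGE9, AGE13 and AGE17 formulas.
AGE_COEFFICIENTS = {
    "AGE9": {2: 0.0082, 3: 0.2386, 4: 0.7387, 5: 0.0051},
    # Assumption: the 6th grade coefficient is read as .0231.
    "AGE13": {6: 0.0231, 7: 0.2314, 8: 0.6954, 9: 0.0036},
    "AGE17": {9: 0.0148, 10: 0.1345, 11: 0.7896, 12: 0.1203},
}


def estimate_age_counts(grade_enrollment):
    """Estimate enrolled 9-, 13- and 17-year-olds from county enrollments.

    grade_enrollment maps a grade number (2..12) to county enrollment;
    grades missing from the mapping are treated as zero.
    """
    return {
        name: sum(coef * grade_enrollment.get(grade, 0)
                  for grade, coef in coefs.items())
        for name, coefs in AGE_COEFFICIENTS.items()
    }


def size_measure(age_counts):
    """SIZE = (AGE9 + AGE13 + AGE17) / 3."""
    return (age_counts["AGE9"] + age_counts["AGE13"]
            + age_counts["AGE17"]) / 3


# Hypothetical county with 1,000 pupils in every grade from 2 through 12.
counts = estimate_age_counts({grade: 1000 for grade in range(2, 13)})
```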

ASIANPOP - Asian Population, 1970

The source of the 1970 Asian population count was the "other specified
races" item of Tabulation 20 of the 1970 Census First Count tapes, File B
(county level summary records). As described in Census User's Guide, Part
II, "other specified races" includes, specifically, Japanese, Chinese,
Filipino, Hawaiian and Korean. For Alaska, Hawaiian and Korean are replaced
by Aleut and Eskimo. (In other states, Aleut and Eskimo are not included
in the Asian count.)

ASIZE - Estimated Asian Size Measure

The estimated Asian population size measure was computed as follows:

   ASIZE = (Asian population, 1970 / Total population, 1970)
             x (Average number of estimated enrolled 9's, 13's and 17's,
                1977-78)

         = (ASIANPOP / TOTPOP) x SIZE .

A description of terms of the expression may be found in this list by
referring to the SAS variable names given.

ASTATE - Postal Abbreviation for the State

The Postal abbreviation is a two-letter state identification code.
For the sampling frame file, these codes were taken from the CIC school
data file previously described.

BIASCHLS - Bureau of Indian Affairs Schools in County

A data tape containing names and addresses of approximately 200 Bureau
of Indian Affairs schools was received from CIC in the Summer of 1973. The
address ZIP code was used to determine each school's county, and the number
of included BIA schools was tabulated for each county.

BLACKPOP - Black Population, 1970

The source of the 1970 Black population count was the 'Negro' race
item of Tabulation 20 of the 1970 Census First Count tapes, File B (county
level summary records). Census User's Guide, Part II states: "Negro.
Includes persons who indicated their race as 'Negro or Black.' Also includes
persons who indicated the 'other race' category and furnished a written
entry that should be classified as 'Negro or Black.'"



BSIZE - Estimated Black Size Measure

The estimated Black population size measure was computed as follows:

   BSIZE = (Black population, 1970 / Total population, 1970)
             x (Average number of estimated enrolled 9's, 13's, 17's,
                1977-78)

         = (BLACKPOP / TOTPOP) x SIZE .

A description of the terms of the expression may be found in this list by
referring to the SAS variable names given.

CENDIV - Census Geographic Division

The Census geographic division containing the state is designated by a
one-digit code, as follows:

   1 - New England            5 - West South Central
   2 - Middle Atlantic        6 - East North Central
   3 - South Atlantic         7 - West North Central
   4 - East South Central     8 - Mountain
                              9 - Pacific

CESAREA - Census Employment Survey Area Population

As part of the Census, data is published for low income areas (called
Census Employment Survey [CES] areas) within selected large cities. In
1970, the Census Bureau defined clusters of census tracts in 40 of the 65
largest Standard Metropolitan Statistical Areas (SMSAs) as CES areas. These
are areas with high percentages of Blacks, high percentages of poverty
families, high unemployment rates, and low percentages of professional
workers. RTI has similarly identified compact groups of inner city Census
tracts in 25 additional SMSAs so that CES-type areas are defined in all of
the 65 largest SMSAs. A data file has been constructed by RTI of
identification and descriptive information for each tract in the CES area
of the 65 largest SMSAs. The data for this file were extracted from 1970
First Count Files, as described in the 1970 Census User's Guide, Part II.
The total populations of CES area Census tracts were summed to the county
level and these counts were added to the NAEP sampling frame records.

COMSIZE - Community Size Stratum

The Community Size Stratum is designated by a one-digit code, defined as
follows:

   1 - SMSA counties containing all or part of a central city ("big
       city") with 200,000 or more population in 1970.

   2 - Remaining counties in "big city" SMSAs, i.e., SMSAs having
       central cities with 200,000 or more population in 1970.

   3 - SMSA counties containing a central city or other place of 25,000
       or more population in 1970, but not in a "big city" SMSA.

   4 - SMSA counties not containing a central city or other place of
       25,000 or more population, and not in a "big city" SMSA.

   5 - Non-SMSA counties containing all or part of a place with 25,000
       or more population in 1970.

   6 - Non-SMSA counties with total urban population of 10,000 or more
       in 1970, but not having a place of 25,000 or more population.

   7 - Non-SMSA counties with a total urban population of less than
       10,000 in 1970 and not containing any portion of a place of
       25,000 or more population in 1970.
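
The seven codes form a decision tree, sketched below. The five inputs are
hypothetical county attributes chosen for illustration; the report does not
say how these conditions were stored on the frame file or in what order they
were tested.

```python
def comsize(in_smsa, in_big_city_smsa, has_big_central_city,
            has_place_25k, urban_population):
    """Return the COMSIZE code (1-7) for one county.

    in_smsa:              county belongs to an SMSA
    in_big_city_smsa:     that SMSA has a central city of 200,000 or more
    has_big_central_city: county contains all or part of such a city
    has_place_25k:        county contains all or part of a central city
                          or other place of 25,000 or more
    urban_population:     total 1970 urban population of the county
    """
    if in_smsa:
        if in_big_city_smsa:
            return 1 if has_big_central_city else 2
        return 3 if has_place_25k else 4
    if has_place_25k:
        return 5
    return 6 if urban_population >= 10_000 else 7
```

For example, a non-SMSA county with 9,999 urban residents and no place of
25,000 or more falls in code 7, the group later split into 'extreme rural'
and 'non-extreme rural' counties.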

COUNTY - 1970 Census County Code

Within each state, counties or county equivalents are identified by a
unique three-digit code assigned by the Census Bureau as part of the 1970
geographic code scheme. The code scheme may be found in various Census
publications, e.g., FIPS PUB 6-1. The source of the numeric codes for the
sampling frame file was the CIC school data file. The Alaska frame unit
representing, collectively, 21 specific places and the Juneau Census
Division was arbitrarily assigned a county code of 999.






CSIZE - Estimated Size Measure Within CES Area

The estimated size measure (average of 9-, 13- and 17-year-old
enrollment, 1977-78) within the Census Employment Survey areas was computed
as follows:

   CSIZE = (CES area population, 1970 / Total population, 1970)
             x (Average number of estimated enrolled 9's, 13's, & 17's,
                1977-78)

         = (CESAREA / TOTPOP) x SIZE .

A description of the terms of the expression may be found in this list by
referring to the SAS variable names given.
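
The subgroup size measures in this list (ASIZE, BSIZE, CSIZE, and the
analogous Hispanic and American Indian measures) share one form: the
subgroup's share of the 1970 population applied to the frame unit's basic
size measure. A minimal sketch follows; the function name is illustrative,
and returning zero for a unit with no 1970 population is an assumption made
here, since the report does not say how that case was handled.

```python
def subgroup_size(subgroup_pop_1970, total_pop_1970, size):
    """Scale the basic size measure by a subgroup's 1970 population share.

    For CSIZE, for example: subgroup_pop_1970 = CESAREA,
    total_pop_1970 = TOTPOP and size = SIZE.
    """
    if total_pop_1970 <= 0:
        # Assumption: a unit without population gets a zero size measure.
        return 0.0
    return subgroup_pop_1970 / total_pop_1970 * size


# Hypothetical frame unit: a quarter of the 1970 population lived in the
# CES area, and SIZE (average enrolled 9's, 13's and 17's) is 400.
csize = subgroup_size(250, 1000, 400)
```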

HISPOP - Maximum of Hispanic Indicators

Three Hispanic indicators were formed for each county or county
equivalent from data of Table 24, Census 4th count file, as follows:

   H1 = Number of persons classified in any of the five Spanish
        categories of the question on "origin or descent" (5 percent
        sample).

   H2 = Number of persons of Puerto Rican birth or parentage (15
        percent sample).

   H3 = Number of persons of "Spanish language" and, in the five
        Southwestern States (Arizona, California, Colorado, New
        Mexico and Texas), persons not of Spanish language but of
        Spanish surname (15 percent sample).

The definitions of each of these categories may be found in a number of
Census publications, including General Social and Economic Characteristics,
United States Summary, Appendix B.

The maximum of the three values, H1, H2 and H3, for each county and
county equivalent was added to the NAEP sampling frame file as the variable
HISPOP.
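
As a brief illustration, the variable can be written as a one-line rule. The
function name and the example counts are hypothetical.

```python
def hispop(h1, h2, h3):
    """HISPOP: the largest of the three Hispanic indicators for a county."""
    return max(h1, h2, h3)


# Hypothetical county in which the Spanish language/surname count is largest.
example = hispop(h1=120, h2=40, h3=310)
```

Taking the maximum rather than a sum avoids counting the same persons more
than once, since the three indicators overlap.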






HSIZE - Estimated Hispanic Size Measure

The estimated Hispanic population size measure was computed as follows:

   HSIZE = (Hispanic population (maximum), 1970 / Total population, 1970)
             x (Average number of estimated enrolled 9's, 13's & 17's,
                1977-78)

         = (HISPOP / TOTPOP) x SIZE .

A description of the terms of the expression may be found in this list by
referring to the SAS variable names given.

INDIAN - American Indian Population, 1970

The American Indian population for the county or county equivalent was
taken from a file constructed using data of 1970 Census First Count Tapes,
File B, Table 20. For a description of this data source, see the 1970
Census User's Guide, Part II.

ISIZE - Estimated Indian Size Measure

The estimated American Indian population size measure was computed as
follows:

   ISIZE = (American Indian population, 1970 / Total population, 1970)
             x (Average number of estimated enrolled 9's, 13's & 17's,
                1977-78)

         = (INDIAN / TOTPOP) x SIZE .

A description of the terms of the expression may be found in this list by
referring to the SAS variable names given.

LATITUDE - County 1970 Population Center Latitude

The latitude of the computed location of the county's 1970 center of
population was taken from a Census Bureau data tape available through
Triangle Universities Computing Center (TUCC). The computed population
center latitude is expressed in decimal degrees.






LONGITUDE - County 1970 Population Center Longitude

See LATITUDE for a description of the source of this data.

LU - Listing Unit Number

Thirty-eight independent cities in Virginia had county-equivalent
status at the time of the 1970 Census. When the frame unit file was first
established, it was recognized that due to their small sizes, many of these
independent cities would ultimately need to be combined with some other
unit(s) to form final sampling units. To facilitate this expected
combinational process, every independent city was grouped with a county
unit, and all frame units of a grouping were assigned the same four-digit
listing unit number.

NAEPREG - Office of Business Economics Region

National Assessment reporting regions coincide with the Office of
Business Economics Regions, and these are designated by a one-digit code,
as follows:

   1 - Northeast     3 - Central
   2 - Southeast     4 - West

NAME - County Name

The county or county equivalent name was taken from the CIC school
file. The Alaska frame unit representing, collectively, 21 specific places
and the Juneau Census Division was labeled "Alaska balance."

NSTATE - 1970 Census State Code

The two-digit 1970 Census state code (numeric) was taken from a
specially prepared SAS data set linking the state alphabetic code, state
numeric code and other geographic identifiers.

OVERSIZE - Oversampling Size Measures

National Assessment directed that low-income, inner-city areas (CES
areas) and extreme rural areas be oversampled to ensure adequate sample
sizes for Blacks and rural students to permit reporting of results for
these subpopulations. To facilitate the oversampling of these areas at
twice the rate of all other areas, an oversampling size measure was
computed for each frame unit, as follows:

   OVERSIZE = Frame unit size measure + CES area size measure
                + extreme rural size measure

            = SIZE + CSIZE + RURSIZE .

The effect of the indicated computation is to double the size measure for
the CES areas and extreme rural areas. (Note: By the manner of their
definition, CES areas and extreme rural areas never occur in the same frame
unit.)
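
A minimal sketch of the computation follows, using the RURSIZE rule given
later in this list (RURSIZE equals SIZE when SDOC = 5 and zero otherwise).
The function names and example values are illustrative, and the separate
Northeast SDOC codes mentioned under SDOC are not handled here.

```python
def rursize(size, sdoc):
    """RURSIZE: the full size measure for extreme rural units, else zero."""
    return float(size) if sdoc == 5 else 0.0


def oversize(size, csize, sdoc):
    """OVERSIZE = SIZE + CSIZE + RURSIZE."""
    return size + csize + rursize(size, sdoc)


# Hypothetical units: an extreme rural county, and a metropolitan county
# whose CES area accounts for 30 of its 100 size-measure units.
rural_unit = oversize(size=100, csize=0, sdoc=5)
metro_unit = oversize(size=100, csize=30, sdoc=1)
```

Selecting frame units with probability proportional to OVERSIZE therefore
counts CES-area and extreme rural enrollment twice, which is what produces
the two-to-one sampling rate.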

POP200K - County Population in Cities Over 200,000

The county population in cities over 200,000 population in 1970 was
summarized from the 1972 County and City Data Book tape file.

POP25K - County Population in Cities Over 25,000

The county population in cities over 25,000 population in 1970 was
summarized from the 1972 County and City Data Book tape file.

PQOCCA - Professional, Technical and Managerial Workers

The county 1970 employment in major occupational categories: (1)
professional, technical and kindred workers, and (2) managers and
administrators, except farm, was summarized from Census 4th count files,
Table 68. This corresponds to NAEP Principal's Questionnaire occupational
category A.

PQOCCB - Sales, Clerical, Foremen and Skilled Workers

The county 1970 employment in major occupational categories: (1)
sales workers, (2) clerical and kindred workers, (3) craftsmen, foremen,
and kindred workers, was summarized from Census 4th count files, Table 68.
This corresponds to NAEP Principal's Questionnaire occupation category B.






PQOCCC - Blue Collar, Service and Private Household Workers

The county 1970 employment in major occupational categories: (1)
operatives, except transport, (2) transport equipment operatives, (3)
laborers, except farm, (4) service workers, except private household, and
(5) private household workers, was summarized from Census 4th count files,
Table 68. This corresponds to NAEP Principal's Questionnaire occupation
category C.

PQOCCD - Farm Workers

The county 1970 employment in major occupational categories: (1)
farmers and farm managers, and (2) farm laborers and foremen was summarized
from Census 4th count files, Table 68. This corresponds to NAEP
Principal's Questionnaire occupation category D.

PQOCCE - Unemployed Persons in Labor Force, 1970

The county's number of unemployed members of the labor force in 1970
was computed using data from the 1972 County and City Data Book tape file,
as follows:

   PQOCCE = Civilian labor force, 16 years old and over x percent
            unemployed of the civilian labor force x .01.

This corresponds to NAEP Principal's Questionnaire occupation category E.

PQOCCF - Recipients of OAA and AFDC, Feb. 1972

The number of recipients of old age assistance and aid to families
with dependent children in February 1972 was summarized from data in the
1972 County and City Data Book tape file. For some counties and county
equivalents, data were not available separately, but were presented in
combination with other units. Missing data were estimated by prorating the
combined OAA/AFDC counts to counties in proportion to their total
populations.
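
The proration rule for combined counts can be sketched as below. The function
name, the county labels, and the numbers are hypothetical; only the rule
itself (shares proportional to total population) comes from the text.

```python
def prorate(combined_count, populations):
    """Split a combined count across units in proportion to population.

    populations maps each unit in the combined group to its total
    population; the returned estimates sum to combined_count.
    """
    total = sum(populations.values())
    return {unit: combined_count * pop / total
            for unit, pop in populations.items()}


# Hypothetical case: 300 OAA/AFDC recipients reported jointly for two
# counties with total populations of 1,000 and 2,000.
estimates = prorate(300, {"county_a": 1000, "county_b": 2000})
```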




RURALPOP - Rural Population, 1970

The county rural population in 1970 was summarized from the Census 4th
count files.

RURSIZE - Extreme Rural Size Measure

The extreme rural population size measure was defined as follows:

   RURSIZE = SIZE if SDOC = 5, i.e., if frame unit 'description of
             community' is extreme rural;

           = 0 otherwise.

SCHL9 - Schools With Grades 2, 3, 4, or 5

The number of schools having any of the grades containing 9-year-olds
(grades 2, 3, 4 or 5) was obtained by summarizing Curriculum Information
Center's 1977-78 file of public, Catholic and other private schools.

SCHL13 - Schools With Grades 6, 7, 8, or 9

The number of schools having any of the grades containing 13-year-olds
(grades 6, 7, 8 or 9) was obtained by summarizing Curriculum Information
Center's 1977-78 file of public, Catholic and other private schools.

SCHL17 - Schools With Grades 9, 10, 11 or 12

The number of schools having any of the grades containing 17-year-olds
(grades 9, 10, 11 or 12) was obtained by summarizing Curriculum Information
Center's 1977-78 file of public, Catholic and other private schools.

SDOC - Sampling Description of Community

The sampling description of community classification represents a
recoding of the community size strata (COMSIZE) as shown in the following
table.

   SDOC Category     Includes COMSIZE Codes

        1            1
        2            2
        3            3, 5
        4            4, 6, 7 ('non-extreme rural' counties)
        5            7 ('extreme rural' counties)

As shown by the table, the counties with COMSIZE codes of 7, i.e.,
non-SMSA counties with urban populations less than 10,000 and with no
portion of a city of 25,000 or more, were partitioned into two sets, an
'extreme rural' set and a 'non-extreme rural' set, prior to assigning SDOC
codes. Identification of the 'extreme rural' set was done in several
steps. First, counties without farm employment (PQOCCD = 0) were
identified and defined to be 'non-extreme rural' counties. For the
remaining counties having COMSIZE codes of 7, an 'extreme rural' index was
computed as follows:

                                 PQOCCD - PQOCCC - 2(PQOCCA)
   Extreme Rural Index = ------------------------------------------
                         PQOCCA + PQOCCB + PQOCCC + PQOCCD + PQOCCE

A high value of the index indicates a relatively high proportion of farm
workers in the county labor force and a relatively low proportion of
professional, technical, and managerial workers and of factory and other
blue-collar and service workers. The counties were ranked on the index
from highest value to lowest value, and the extreme rural counties were
identified as those having an index value of -0.607 or greater. In the
northeast, an index value of -0.681 was required to allow an allocation of
at least 1 replicate per annual sample. Given that extreme rural is to be
sampled at a rate twice that of non-extreme areas, this definition assures
that 10 percent of the sample will be extreme rural. Thus, non-SMSA
counties with total urban population of less than 10,000 in 1970, not
containing any portion of a place of 25,000 or more population in 1970, and
possessing a large enough extreme rural index to insure that 10 percent of
the sample would be extreme rural were classified as SDOC 5. Remaining
non-SMSA counties with urban populations less than 10,000 and with no
portion of a city of 25,000 or more were classified as SDOC 4. In the
northeast, since a different extreme rural index cut-off point was required,
the categories were called SDOC 6 and 7.
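
The recoding and the extreme rural test can be sketched together. Two points
are assumptions rather than facts from the report: the middle term of the
index numerator is read as PQOCCC (blue-collar and service workers), which is
what the verbal description of the index implies, and the Northeast is
handled only by passing its cut-off, without reproducing the separate SDOC 6
and 7 codes. Function and parameter names are illustrative.

```python
EXTREME_RURAL_CUTOFF = -0.607   # cut-off quoted for most regions
NORTHEAST_CUTOFF = -0.681       # cut-off quoted for the northeast

# COMSIZE codes that map directly to an SDOC category.
DIRECT_RECODE = {1: 1, 2: 2, 3: 3, 5: 3, 4: 4, 6: 4}


def extreme_rural_index(pqocca, pqoccb, pqoccc, pqoccd, pqocce):
    """Index contrasting farm workers with professional and blue-collar
    workers, relative to the total of occupation categories A through E."""
    numerator = pqoccd - pqoccc - 2 * pqocca
    return numerator / (pqocca + pqoccb + pqoccc + pqoccd + pqocce)


def sdoc(comsize, pqocca, pqoccb, pqoccc, pqoccd, pqocce,
         cutoff=EXTREME_RURAL_CUTOFF):
    """Return SDOC 1-5 for a county from COMSIZE and occupation counts."""
    if comsize in DIRECT_RECODE:
        return DIRECT_RECODE[comsize]
    if comsize != 7:
        raise ValueError("COMSIZE must be an integer from 1 to 7")
    if pqoccd == 0:
        # Counties without farm employment are 'non-extreme rural'.
        return 4
    index = extreme_rural_index(pqocca, pqoccb, pqoccc, pqoccd, pqocce)
    return 5 if index >= cutoff else 4
```

For instance, a COMSIZE 7 county with occupation counts A=5, B=5, C=5, D=80,
E=5 has an index of (80 - 5 - 10) / 100 = 0.65 and is classified SDOC 5.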

SIZE - Average of Estimated 9's, 13's and 17's, 1977-78

The basic size measure for each frame unit was computed as the average
of estimated 1977-78 9-, 13-, and 17-year-old enrollments, as follows:

   SIZE = (AGE9 + AGE13 + AGE17)/3 .

SMSA - 1970 SMSA Code

The 1970 four-digit SMSA codes were taken from county summary records
of the Census First Count files. In New England, counties containing more
than one SMSA were assigned the code of the predominant SMSA, and counties
with less than 50 percent urban population were not assigned SMSA codes.

SMSA77 - 1977 SMSA or New England County Metropolitan Area (NECMA) Code

The current SMSAs and codes were taken from the publication Standard
Metropolitan Statistical Areas, 1975 and subsequent OMB Information Office
releases. New England County Metropolitan Area codes were taken from the
same source.

TOTPOP - Total Population, 1970

The county total population was derived from intermediate data from
Census 4th count files, as follows:

   TOTPOP = Urban population + Rural population .

Urban population was not retained as a separate item.

3.3 Editing and Verification Procedures

Numerous editing and verification procedures were performed during
compilation of the sampling frame to ensure its accuracy and completeness.
Whenever possible, frame data were verified, either directly or in summary
form, by comparison to published data, usually Census reports.
Discrepancies were investigated by tracing the frame data through each
stage of its development from its origin. Inaccurate data were replaced
either by developing correct data for all records from the source and
merging to the frame file or by selectively correcting the file using
direct interactive editing procedures.



The following specific edits and verifications were performed:

(A) The number of county equivalent frame units for each State was
verified to a count made from a listing of counties in a FIPS
publication.

(B) School count and age enrollment totals were obtained from the
frame file and compared for reasonableness with data from the
1975 Current Population Survey and 1976 Digest of Educational
Statistics.

(C) The number of 1970 Standard Metropolitan Statistical Area (SMSA)
counties by state was tabulated from the frame file and verified
using published Census information.

(D) The 1970 SMSA codes represented on the frame file were listed
numerically and verified to a published Census list.

(E) The 1977 SMSA counties were listed, by SMSA, from the frame file
and the listing was verified to the source document, Standard
Metropolitan Statistical Areas, 1975.

(F) State totals were obtained from the frame file for total
population, rural population and Black population, and these were
checked against published Census data. Only minor differences
were found.



(G) For the state of Virginia, a 100 percent check was made of master
file county total population and rural population against published
Census data. Only minor differences were noted. A complete
check was also made of county population in places of 25,000 or
more and places of 200,000 or more. Arbitrarily selected
counties were checked for correctness of employment by occupation
totals, the unemployment count, and Old Age Assistance and Aid
for Dependent Children Recipient count. These county level
checks showed that erroneous data were present in the frame file
for some variables. Correct data were obtained and substituted.



(H) A Statistical Analysis System (SAS) procedure, DATACHK, was used
to identify and list the five largest and five smallest values of
each frame file descriptive variable. These extreme data were
verified individually against published data.
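
The extreme-value check in edit (H) can be sketched in Python. This is only an illustration of the idea, not the SAS DATACHK procedure itself; the record layout, the function name extreme_values, and the sample values are assumptions made for the example.

```python
import heapq

def extreme_values(records, variable, k=5):
    """Return the k smallest and k largest (value, county) pairs for one variable."""
    pairs = [(rec[variable], rec["county"])
             for rec in records if rec.get(variable) is not None]
    return heapq.nsmallest(k, pairs), heapq.nlargest(k, pairs)

# Hypothetical frame records; the values are illustrative, not frame data.
populations = [1200, 560000, 8300, 45, 99000, 3100,
               7600, 250, 18000, 64000, 410, 5200]
frame = [{"county": f"C{i:02d}", "TOTPOP": pop}
         for i, pop in enumerate(populations)]

smallest, largest = extreme_values(frame, "TOTPOP")
```

Each listed extreme would then be looked up by hand in the published source, as the report describes.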






4. STRATIFICATION

4.1 Overview



NAEP and RTI staff agreed that the primary sample selection in Year 11
would be a coordinated four-year sample. The discussions which preceded
the design of this sample brought to light a number of sampling objectives.
These objectives and the sampling approach to implement them are discussed
in the sections which follow.

4.1.1 Sample Design Objectives

A major objective of the four-year primary sample beginning in Year 11
was to insure that at least one PSU was present in each region by size of
community category annually. In previous primary samples, this control had
not been maintained, and as a result, the numbers of sample respondents in
size and type of community (STOC) cells were not stable from year to year.
Extreme fluctuations were noted when region was crossed with STOC.

Another major objective was to insure that each state and the District
of Columbia was represented at least once in the four-year primary sample.
It was also desired to have the sample be as widely dispersed as possible
over the four-year period. Basically, controlled selection was liked for
its sample control but not liked for its complicated, biased estimates of
variance.

A third design objective was to reduce the geographic size of PSUs.
This modification would have the effect of (1) reducing field costs as well
as aiding the field staff and (2) reducing the number of reselected
districts. Since PSUs are selected in proportion to population, larger
geographic areas are selected more frequently since, in general, they contain
a larger population. Although control is maintained so that no school is
selected more than once in a four-year period, no such control is exercised
over districts.

Redefining sampling size of community to more closely align with size
and type of community definitions was a fourth design objective.

Objective five concerned the target population, which consisted of
9-year-olds, 13-year-olds, and 17-year-olds enrolled in school as well as
17-year-olds who were dropouts and early graduates. In Year 11,
9-year-olds and 13-year-olds were defined as individuals born during calendar
years 1970 and 1966, respectively; 17-year-olds were defined as persons
born between October 1, 1962 and September 30, 1963.
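
The Year 11 age definitions amount to a simple classification on date of birth. The sketch below is an illustration only; the function name year11_age_class is hypothetical, and treating both endpoints of the 17-year-old window as inclusive is an assumption.

```python
from datetime import date

def year11_age_class(birth):
    """Return 9, 13, or 17 for a Year 11 eligible birth date, else None."""
    if birth.year == 1970:
        return 9                      # born during calendar year 1970
    if birth.year == 1966:
        return 13                     # born during calendar year 1966
    # Assumed inclusive window: October 1, 1962 through September 30, 1963.
    if date(1962, 10, 1) <= birth <= date(1963, 9, 30):
        return 17
    return None
```

Enrollment status (in school, dropout, or early graduate) is a separate attribute and is not modeled here.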

In order to insure adequate sample representation for the reporting
subpopulations, low income and extreme rural areas will be oversampled as
the sixth objective.

Objective seven states that a school will appear in the sample no more
than once every four years. A school may appear in the sample for more
than one age. However, when this situation occurs, it must happen in the
same assessment year. Also, schools appearing in the Year 10 sample will be
excluded from the Year 11 sample.

An eighth design objective concerned estimates of sample variance
which were simple and relatively unbiased.

The last objective stated that each annual sample be able to accommodate
either 75 PSUs with 550 schools at each age level or 100 PSUs with
1,000 schools at each age level.
4.1.2 Implementing These Objectives

In order to implement the design objectives stated in section 4.1.1, a
highly stratified four-year primary sample was developed. In response to
objective one, a single sample of 162 replicates was allocated to region
and size of community categories in proportion to adjusted average numbers
of 9-, 13-, and 17-year-olds in each class. The single sample allocation
was multiplied by the total number of samples to determine the total
allocation (see table 4-1). The single and total sample allocations were then
translated into numbers of one-, two-, and three-replicate units (see table
4-2). This procedure ensured that each region by size of community
category was represented in each sample. The specific procedure is discussed
further in section 4.2.

To represent each state and the District of Columbia in each sample
and to disperse the sample as widely as possible, for each region by
community category, the sampling frame was ordered in a serpentine fashion and
equal sized zones were formed to accommodate the region by size of
community allocation. One sampling unit was selected from each zone, thus insuring
a wide dispersion of the sample as well as representing each state over the
total sample.

i ' 

In response to objective three, Standard Metropolitan Statistical
Areas (SMSAs) were abandoned as primary sampling units. Instead, single
counties were used to define PSUs. Counties estimated to contain fewer
than 1,500 average 9-, 13-, and 17-year-olds were grouped with near
neighbor counties in the same state and community category until a minimum size
of 1,500 was achieved.

Sampling description of community (SDOC) was developed to more closely
align sampling size of community with STOC as stated in objective four.
SDOC is discussed and defined in section 5.2.

The target populations defined in objective five were observed.

In response to objective six, the Census Employment Survey (CES) low
income areas were used to define and oversample low income metropolitan
areas in 40 Standard Metropolitan Statistical Areas (SMSAs) where such tracts
had been identified. For the 25 cities among the largest 65 SMSAs where CES
areas were not defined, compact groups of inner city Census tracts with low
income characteristics similar to the CES areas were defined. The extreme
rural subpopulation was defined, and oversampled, as those counties
classified as SDOC5.

Objective seven was met and no school will appear in the sample more
than once every four years.

The eighth objective concerning simple, relatively unbiased estimates
of variance was met by selecting independent school samples for each
replicate within each PSU. Single-replicate PSUs were paired with another
single unit or double unit in the same region and size of community
category. With two or three primary units per stratum, simple squared
differences provide direct estimates of the variance among PSUs within strata.
The variance of NAEP proportion correct (P-value) ratio estimators and
other related nonlinear statistics, such as "Raw" and "Balanced" change in
proportion correct (ΔP-values), can be approximated by forming squared
differences between appropriate Jackknife pseudovalues.
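
The squared-difference idea can be illustrated with the textbook between-PSU variance estimator for a stratified design with two or three primary units per stratum. This is a sketch under stated assumptions, not the report's estimator: the function name between_psu_variance is hypothetical, the inputs are assumed to be weighted PSU-level contributions to an estimated total, and the Jackknife pseudovalue step for ratio statistics is not shown.

```python
def between_psu_variance(strata):
    """Estimate the variance of an estimated total from PSU contributions.

    strata: list of lists; each inner list holds the weighted contributions
    z_hi of the 2 or 3 PSUs in one stratum.
    """
    total = 0.0
    for z in strata:
        n = len(z)
        if n < 2:
            raise ValueError("each stratum needs at least two PSUs")
        mean = sum(z) / n
        # n/(n-1) times the sum of squared deviations from the stratum mean
        total += n / (n - 1) * sum((v - mean) ** 2 for v in z)
    return total
```

With exactly two units in a stratum the term reduces to the simple squared difference (z1 - z2)^2, which is why pairing single-replicate PSUs keeps the computation simple.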

Accommodating the last objective of 75 PSUs with 550 schools per age
class or 100 PSUs with 1,000 schools per age class is discussed in section
6. Briefly, the objective was met by defining a fifth primary sample which
could be used for the dual purposes of (1) augmenting the 75 PSU primary
sample up to the 100 PSU level or (2) providing replacement PSUs for those
which refuse.

4.2 Sample Allocation by Region and SDOC Category 

As noted in section 4.1, a major objective of the Year 11 primary
sample was the selection of at least one PSU annually from each region by
size of community category in order to reduce annual fluctuations in
numbers of sample respondents by STOC categories. This objective was met
by allocating a single sample of 162 replicates in proportion to a measure
of size for each region by SDOC category. SDOC categories are defined in
section 3.2. The measure of size was the average number of 9-, 13-, and
17-year-olds counting those in inner city and extreme rural areas twice.
Inner city (CESAREA) and extreme rural (RURSIZE) areas are defined in
section 3.2. The size measure for each region by SDOC category as well as
the proportional allocation of 162 replicates in fractional and integer
form is shown in table 4-1. As noted in section 3.2, it was necessary to
increase the extreme rural index cut-off for the Northeast from -0.607 to
-0.681 to allow an allocation of at least one replicate for the single
sample. By this procedure SDOC6 and 7 (comparable to SDOC4 and 5) were
defined in the Northeast. The integer single-sample allocation was
multiplied by 5 in table 4-1 to obtain the total sample allocation.
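
The proportional allocation can be reproduced from the size measures listed in table 4-1. The sketch below is an illustration, not the program used for the report; the names SIZE and allocate are hypothetical, and the use of ordinary rounding is an assumption that happens to reproduce the integer column and the total of 162 for these data.

```python
# Size measures by (region, SDOC), as listed in table 4-1.
SIZE = {
    (1, 1): 337519, (1, 2): 231294, (1, 3): 321465, (1, 6): 127115, (1, 7): 20769,
    (2, 1): 171171, (2, 2): 90011, (2, 3): 272331, (2, 4): 312766, (2, 5): 127759,
    (3, 1): 382934, (3, 2): 186151, (3, 3): 268679, (3, 4): 188897, (3, 5): 211410,
    (4, 1): 496084, (4, 2): 78696, (4, 3): 268835, (4, 4): 138779, (4, 5): 83343,
}

def allocate(size, replicates=162):
    """Allocate replicates to cells in proportion to the measure of size."""
    total = sum(size.values())
    fractional = {cell: replicates * s / total for cell, s in size.items()}
    # Assumption: simple rounding; a different rule would be needed if the
    # rounded values did not add up to the required number of replicates.
    integer = {cell: round(a) for cell, a in fractional.items()}
    return fractional, integer

fractional, integer = allocate(SIZE)
five_sample = {cell: 5 * n for cell, n in integer.items()}
```

For example, region 1 and SDOC1 receives a fractional allocation of about 12.67, an integer allocation of 13, and a five-sample allocation of 65.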

In table 4-2, the single and total sample allocations of 162 and 810
replicates, respectively, were partitioned into 1-, 2-, and 3-replicate
units. In region 1 and SDOC1 category, 13 replicates were to be selected
for a single sample, which translates into 5 2-replicate units and 1
3-replicate unit (5 x 2 + 1 x 3 = 13). The total five-sample allocation for
this region and SDOC category was 65 replicates (13 x 5), which translates into
25 2-replicate units and 5 3-replicate units (5 x 5 x 2 + 5 x 1 x 3 = 65).
4.3 Meeting the All-State Requirement Over a Four-Year Period

In order to ensure that each state and the District of Columbia were
included in the sample over a four-year period and that the sample was
widely dispersed, the frame was ordered in a serpentine geographic fashion



Table 4-1. Sample allocation by region and SDOC categories

                      Size     Single-sample   Integer single-    Five-sample
Region    SDOC      measure     allocation    sample allocation   allocation

  1         1       337,519       12.67              13                65
            2       231,294        8.68               9                45
            3       321,465       12.07              12                60
            6       127,115        4.77               5                25
            7        20,769        0.78               1                 5
        Subtotal  1,038,162       38.97              40               200

  2         1       171,171        6.42               6                30
            2        90,011        3.38               3                15
            3       272,331       10.22              10                50
            4       312,766       11.74              12                60
            5       127,759        4.80               5                25
        Subtotal    974,038       36.56              36               180

  3         1       382,934       14.37              14                70
            2       186,151        6.99               7                35
            3       268,679       10.08              10                50
            4       188,897        7.09               7                35
            5       211,410        7.94               8                40
        Subtotal  1,238,071       46.47              46               230

  4         1       496,084       18.62              19                95
            2        78,696        2.95               3                15
            3       268,835       10.09              10                50
            4       138,779        5.21               5                25
            5        83,343        3.13               3                15
        Subtotal  1,065,737       40.00              40               200

TOTAL             4,315,008      162.00             162               810



Table 4-2. Allocation in terms of 1-, 2-, and 3-replicate units

                     Single-sample allocation          Five-sample allocation
Region    SDOC     Total reps  1-rep  2-rep  3-rep   Total reps  1-rep  2-rep  3-rep

  1         1          13        -      5      1         65        -     25      5
            2           9        -      3      1         45        -     15      5
            3          12        -      6      -         60        -     30      -
            6           5        1      2      -         25        5     10      -
            7           1        1      -      -          5        5      -      -
        Subtotal       40        2     16      2        200       10     80     10

  2         1           6        -      3      -         30        -     15      -
            2           3        1      1      -         15        5      5      -
            3          10        -      5      -         50        -     25      -
            4          12        -      6      -         60        -     30      -
            5           5        1      2      -         25        5     10      -
        Subtotal       36        2     17      -        180       10     85      -

  3         1          14        -      7      -         70        -     35      -
            2           7        -      2      1         35        -     10      5
            3          10        -      5      -         50        -     25      -
            4           7        1      3      -         35        5     15      -
            5           8        -      4      -         40        -     20      -
        Subtotal       46        1     21      1        230        5    105      5

  4         1          19        -      8      1         95        -     40      5
            2           3        1      1      -         15        5      5      -
            3          10        -      5      -         50        -     25      -
            4           5        1      2      -         25        5     10      -
            5           3        1      1      -         15        5      5      -
        Subtotal       40        3     17      1        200       15     85      5

TOTAL                 162        8     71      4        810       40    355     20




and equal sized zones were formed to accommodate the total sample
allocation for each region by SDOC category. One sampling unit was selected from
each zone. Since the total sample allocation was 810 replicates or 415
units (40 1-rep + 355 2-rep + 20 3-rep) and the frame was ordered in a
systematic fashion and only one unit was selected from each zone, the
sample was assumed to have a wide dispersion as well as to meet the
all-state requirement. The total sample allocation by region, state, and
SDOC is provided in table 4-3 while the serpentine ordering of states is
provided in table 4-4. An example of how the serpentine ordering is
applied to the sampling frame is provided in section 4.4.2 for one region
by SDOC category.
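
The unit and replicate totals can be checked from the TOTAL row of table 4-2. The short sketch below is only an arithmetic check; the names UNITS and totals are hypothetical.

```python
# Number of units by replicate size (1-, 2-, 3-replicate), from the TOTAL
# row of table 4-2.
UNITS = {
    "single sample": {1: 8, 2: 71, 3: 4},
    "five samples": {1: 40, 2: 355, 3: 20},
}

def totals(units):
    """Return (number of units, number of replicates)."""
    n_units = sum(units.values())
    n_replicates = sum(size * count for size, count in units.items())
    return n_units, n_replicates

single = totals(UNITS["single sample"])
five = totals(UNITS["five samples"])
```

The single sample gives 83 units and 162 replicates, and the five-sample total gives 415 units and 810 replicates.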

4.4 Selecting the Sample

Before the primary sample could be selected, all counties in the
United States had to be formed into primary listing units. The listing
units were then ordered, zones were formed, and the total sample was
selected. The total sample was assigned to years. Each of these topics is
discussed in the sections which follow.

4.4.1 Form Listing Units

In order for each PSU to contain enough population to accommodate the
selection of 1,000 schools per age class, it was determined that each PSU
must contain a minimum of 1,500 average 9-, 13-, and 17-year-olds. A PSU
was then defined as any county or group of near neighbor counties in the
same state and of the same SDOC type with a total average number of
estimated 9-, 13-, and 17-year-olds of at least 1,500.

Any county which had an estimated average number of 9-, 13-, and
17-year-olds of at least 1,500 was automatically classified as a PSU. A
listing by state and SDOC of all counties whose estimated average number of



Table 4-3. Five-sample allocation by region, state, and SDOC

State                     SDOC1    SDOC2    SDOC3    SDOC6    SDOC7     Total

Northeast region

Connecticut                   -        -     9.70     0.70        -     10.40
District of Columbia       3.28        -        -        -        -      3.28
Delaware                      -        -     1.49     0.70        -      2.19
Massachusetts              2.78    10.87     5.87     0.68        -     20.20
Maryland                   3.92     9.69     0.67     1.39     1.21     16.88
Maine                         -        -     1.50     2.47        -      3.97
New Hampshire                 -              1.32     1.15        -
New Jersey                 5.63    10.64     7.31     2.45        -     26.03
New York                  35.64     3.15    18.94     5.83     1.40     64.96
Pennsylvania              13.75    10.08    10.23     8.28     1.21     43.55
Rhode Island                  -        -     2.57     0.46        -      3.03
Vermont                       -        -     0.40     0.89     1.18      2.47
  Subtotal                65.00    45.00    60.00    25.00     5.00    200.00

Southeast region

Alabama                    2.28     0.48     4.87     4.92     1.10     13.65
Arkansas                      -     0.17     2.40     3.34     2.70      8.61
Florida                   10.39     1.05    10.20     3.75     1.51     26.90
Georgia                    3.83     2.55     3.51     6.30     3.03     19.22
Kentucky                   2.60     0.99     1.57     4.16     4.85     14.17
Louisiana                  2.83     1.74     4.81     5.06     1.60     16.04
Mississippi                   -     0.19     2.86     4.98     1.89      9.92
North Carolina             1.47     0.68     7.14     7.54     3.44     20.27
South Carolina                -        -     4.21     5.73     0.93     10.87
Tennessee                  4.80     1.12     3.43     4.82     1.57     15.74
Virginia                   1.80     6.03     2.80     5.45     2.08     18.16
West Virginia                 -        -     2.20     3.95     0.30      6.45
  Subtotal                30.00    15.00    50.00    60.00    25.00    180.00

Central region

Iowa                       0.93     0.43     3.63     0.94     7.85     13.78
Illinois                  19.95     8.59     5.85     4.43     3.92     42.74
Indiana                    3.10     2.03     7.86     4.65     2.25     19.89
Kansas                     1.31     1.63     1.31     1.74     3.03      9.02
Michigan                   9.53     7.92     9.83     6.52     0.99     34.79
Minnesota                  5.24     2.65     2.37     2.36     5.33     17.95
Missouri                   5.87     4.74     1.93     2.69     4.79     20.02
North Dakota                  -        -     0.81     0.34     1.96      3.11
Nebraska                   1.63     0.27     0.70     1.03     3.43      7.06
Ohio                      18.69     4.91     8.48     6.74     0.55     39.37
South Dakota                  -        -     0.74     0.56     2.19      3.49
Wisconsin                  3.75     1.83     6.49     3.00     3.71     18.78
  Subtotal                70.00    35.00    50.00    35.00    40.00    230.00

West region

Alaska                        -        -     0.65     0.46        -      1.11
Arizona                    6.47        -     0.49     1.63        -      8.59
California                44.95     5.16    21.84     3.73     0.41     76.09
Colorado                   1.61     3.89     2.07     1.19     1.01      9.77
Hawaii                     3.05        -     0.02     0.04        -      3.11
Idaho                         -        -     1.03     1.10     1.55      3.68
Montana                       -        -     0.89     0.96     1.51      3.36
New Mexico                 1.34     0.07     1.08     1.69     0.22      4.40
Nevada                        -        -     1.69     0.35     0.08      2.12
Oklahoma                   4.25     0.70     1.15     2.33     1.99     10.42
Oregon                     2.44     0.69     1.92     2.13     0.66      7.84
Texas                     26.82     2.89     8.98     5.59     6.25     50.53
Utah                          -        -     3.49     0.94     0.25      4.68
Washington                 4.07     1.60     4.28     2.23     0.52     12.70
Wyoming                       -        -     0.42     0.63     0.55      1.60
  Subtotal                95.00    15.00    50.00    25.00    15.00    200.00

TOTAL                    260.00   110.00   210.00   145.00    85.00    810.00




Table 4-4. Serpentine ordering of states within region

Northeast                  Southeast            Central             West

 1 Maine                    1 Mississippi        1 Iowa              1 Alaska
 2 New Hampshire            2 Louisiana          2 Wisconsin         2 Montana
 3 Massachusetts            3 Arkansas           3 Michigan          3 Wyoming
 4 Rhode Island             4 Tennessee          4 Ohio              4 Idaho
 5 Connecticut              5 Kentucky           5 Indiana           5 Nevada
 6 New Jersey               6 West Virginia      6 Illinois          6 Utah
 7 Delaware                 7 Virginia           7 Missouri          7 Colorado
 8 District of Columbia     8 North Carolina     8 Kansas            8 Oklahoma
 9 Maryland                 9 South Carolina     9 Nebraska          9 Texas
10 Pennsylvania            10 Georgia           10 South Dakota     10 New Mexico
11 New York                11 Florida           11 North Dakota     11 Arizona
12 Vermont                 12 Alabama           12 Minnesota        12 Hawaii
                                                                    13 California
                                                                    14 Oregon
                                                                    15 Washington





9-, 13-, and 17-year-olds was less than 1,500 was obtained. Those counties
were mapped and near neighbors in the same state and SDOC category were
manually aggregated until the total average number of 9's, 13's, and 17's
in the group met or exceeded 1,500.
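
The grouping rule can be sketched as a simple accumulation. The report describes a manual, map-based aggregation, so the code below is only an approximation under stated assumptions: the counties belong to one state and one SDOC category, they are already listed so that neighbors are adjacent, and a leftover group below the minimum is merged into the preceding group. The function name form_listing_units and the example sizes are hypothetical.

```python
def form_listing_units(counties, minimum=1500):
    """Group (name, size) pairs into listing units of at least `minimum`."""
    # Counties meeting the minimum stand alone as PSUs.
    units = [[name] for name, size in counties if size >= minimum]
    groups, group, group_size = [], [], 0
    for name, size in counties:
        if size >= minimum:
            continue
        group.append(name)
        group_size += size
        if group_size >= minimum:
            groups.append(group)
            group, group_size = [], 0
    if group:
        if groups:
            groups[-1].extend(group)   # assumed handling of a small remainder
        else:
            groups.append(group)       # would need manual review in practice
    return units + groups

# Hypothetical county sizes (average numbers of 9-, 13-, and 17-year-olds).
example = [("A", 2000), ("B", 600), ("C", 700), ("D", 400),
           ("E", 900), ("F", 700), ("G", 200)]
listing_units = form_listing_units(example)
```

Here county A stands alone, B through D form one unit of 1,700, and E through G form another of 1,800.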
4.4.2 Order Frame




Once the listing units were formed, they were ordered, within each
region and SDOC category, in serpentine fashion by state as specified in
table 4-4 and alternately within each state by increasing and then
decreasing percent minority. The alternating percent minority was obtained by
renumbering states using the serpentine order and assigning a negative sign
to the percent minority if the state was even. The ordering is illustrated
in table 4-5 for the Urban Fringe (SDOC2) in the southeast region (region
2).
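
The signed minority index can be sketched in Python using the units of the table 4-5 illustration. This is an illustration only, not the report's software; the names UNITS, serpentine_sort, and rotate_to are hypothetical, the random start is taken as given rather than drawn, and the Louisiana percent minority (only partly legible in the table) is a placeholder that does not affect the order because that state has a single unit.

```python
# (state, PSU, serpentine order of state, percent minority)
UNITS = [
    ("Florida", 2025, 11, 38.894), ("Florida", 2031, 11, 15.464),
    ("Tennessee", 2037, 4, 22.030), ("Florida", 2057, 11, 24.565),
    ("Louisiana", 2071, 2, 49.0),   # placeholder value, see note above
    ("Alabama", 2073, 12, 33.928), ("Georgia", 2089, 10, 15.202),
    ("Florida", 2103, 11, 9.713), ("Kentucky", 2111, 5, 15.802),
    ("North Carolina", 2119, 8, 25.104), ("Georgia", 2121, 10, 40.639),
    ("Tennessee", 2157, 4, 39.160), ("Virginia", 2760, 7, 43.393),
    ("Virginia", 3129, 7, 31.579),
]

def serpentine_sort(units):
    """Order by state, alternating increasing/decreasing percent minority."""
    present = sorted({order for _, _, order, _ in units})
    rank = {order: i + 1 for i, order in enumerate(present)}  # renumber states

    def signed_index(unit):
        _, _, order, pct = unit
        return -pct if rank[order] % 2 == 0 else pct  # negative for even states

    return sorted(units, key=lambda u: (u[2], signed_index(u)))

def rotate_to(units, psu):
    """Restart the ordered list at the unit holding the random start."""
    start = next(i for i, u in enumerate(units) if u[1] == psu)
    return units[start:] + units[:start]

ordered = serpentine_sort(UNITS)
final = rotate_to(ordered, 2025)
```

Sorting on the signed index in ascending order gives increasing percent minority in odd-numbered states and decreasing percent minority in even-numbered states, which is the alternation described above.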

The ordering shown in table 4-5 is a final ordering. To obtain a
starting point for this final order, listing units within each region and
SDOC category were serpentine ordered by state and alternately within each
state by increasing and decreasing percent minority. A random number was
then selected between 1 and the total sample allocation. In the example,
the number was 14.1010 (14 < 14.1010 < 15). This number was located in the
accumulated allocation and the order began there. For the example, the
allocation 14.1010 occurred in the fourth Florida listing unit so the final
ordering shown in table 4-5 began there.

Notice in table 4-5 that Louisiana, Kentucky, North Carolina and
Florida, which are the first, third, fifth, and seventh (i.e., odd numbered)
states in the region by SDOC category, have positive minority indices.
Tennessee, Virginia, and Georgia, which are second, fourth, and sixth (i.e.,
even numbered) states, have negative minority indices. The alternately



Table 4-5. Illustration of serpentine ordering of sampling frame

                    Primary      Serpentine   Minority     Sample     Cumulative
State            sampling unit     order       index     allocation   allocation

Florida              2025            11        38.894      2.1861       2.1861
Alabama              2073            12       -33.928      1.1413       3.3274
Louisiana            2071             2        49.?18      1.4128       4.7402
Tennessee            2157             4       -39.160      1.6016       6.3418
Tennessee            2037             4       -22.030      0.7998       7.1416
Kentucky             2111             5        15.802      1.2986       8.4402
Virginia             2760             7       -43.393      0.3989       8.8391
Virginia             3129             7       -31.579      0.5005       9.3396
North Carolina       2119             8        25.104      0.7356      10.0752
Georgia              2121            10       -40.639      1.1972      11.2724
Georgia              2089            10       -15.202      0.7213      11.9937
Florida              2103            11         9.713      0.9045      12.8982
Florida              2031            11        15.464      1.0849      13.9831
Florida              2057            11        24.565      1.0169      15.0000




increasing and decreasing order of the minority indices by state within the
serpentine order may be seen by examining Georgia, where the index is in
decreasing order (ignore sign), and Florida, where the order is increasing.

4.4.3 Form Zones

The total five-sample allocation for each region by SDOC category
shown in table 4-2 was formed into one-, two-, and three-replicate zones.
Specifically, for the region 2 and SDOC2 example of table 4-5, the
total five-sample allocation of 15 replicates, or 5 single-replicate units
and 5 double-replicate units as noted in table 4-2, was apportioned among
the primary sampling units in proportion to the adjusted average numbers of
9-, 13-, and 17-year-olds. Adjusted implies that those populations in the
oversampled areas were counted twice. This sample allocation and the
accumulated allocation are shown in table 4-5.

A total of ten zones were formed for the example, consisting of 5
double-replicate zones followed by 5 single-replicate zones. Each
double-replicate zone consisted of a sample allocation of 2. Sample allocation
was accumulated down the ordered list until a cumulative allocation of 2
was obtained. Thus, the first zone was composed entirely of Florida PSU
2025. Zone 2 consisted of 0.1861 of Florida PSU 2025, all of Alabama PSU
2073 whose allocation was 1.1413, and 0.6726 replicates of Louisiana PSU
2071. Zone 2 then contained a sample allocation of 2 (0.1861 + 1.1413 +
0.6726). The remainder of the Louisiana PSU 2071 (i.e., 0.7402 = 1.4128 -
0.6726) was included in zone 3. This procedure continued until 5
double-replicate zones were formed. The total allocation to these zones was
10 (5 x 2) and they included all units through North Carolina PSU 2119
except for 0.0752 of the North Carolina unit. From each zone thus formed,
one unit (either a single-, double-, or triple-replicate unit) was selected.
The specific algorithm for selecting these units is described in section 6.
The units were selected with probability proportional to the adjusted
numbers of 9-, 13-, and 17-year-olds.

The 0.0752 of the North Carolina unit as well as the 5 units at the
end of the list were formed into 5 single-replicate zones using the
procedure described in the previous paragraph except the cumulative sample
allocation used to define a zone was 1 instead of 2. Thus, the first
single-replicate zone consisted of 0.0752 of North Carolina PSU 2119 and
0.9248 of Georgia PSU 2121. The remainder of the Georgia PSU 2121 (i.e.,
0.2724 = 1.1972 - 0.9248) was included in single-replicate zone 2.

When the region by SDOC allocation consisted of double- and
triple-replicates, the double-replicate zones were formed first followed by the
triple-replicate zones.
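
The zone formation for the region 2 and SDOC2 example can be sketched as an accumulation down the ordered list. This is an illustration under stated assumptions, not the report's program: the names ALLOC and form_zones are hypothetical, shares are rounded to four decimals for display, and a small tolerance is used to absorb floating-point error.

```python
# Ordered sample allocations from table 4-5.
ALLOC = [
    ("Florida 2025", 2.1861), ("Alabama 2073", 1.1413),
    ("Louisiana 2071", 1.4128), ("Tennessee 2157", 1.6016),
    ("Tennessee 2037", 0.7998), ("Kentucky 2111", 1.2986),
    ("Virginia 2760", 0.3989), ("Virginia 3129", 0.5005),
    ("North Carolina 2119", 0.7356), ("Georgia 2121", 1.1972),
    ("Georgia 2089", 0.7213), ("Florida 2103", 0.9045),
    ("Florida 2031", 1.0849), ("Florida 2057", 1.0169),
]
EPS = 1e-9  # tolerance for floating-point remainders

def form_zones(alloc, zone_sizes):
    """Split the ordered allocations into consecutive zones of given sizes.

    Returns a list of zones, each a list of (unit, share) pairs.
    """
    zones, i, remaining = [], 0, alloc[0][1]
    for size in zone_sizes:
        zone, need = [], size
        while need > EPS and i < len(alloc):
            take = min(remaining, need)
            zone.append((alloc[i][0], round(take, 4)))
            need -= take
            remaining -= take
            if remaining <= EPS:       # unit used up, move to the next one
                i += 1
                remaining = alloc[i][1] if i < len(alloc) else 0.0
        zones.append(zone)
    return zones

# Five double-replicate zones followed by five single-replicate zones.
zones = form_zones(ALLOC, [2] * 5 + [1] * 5)
```

The second zone comes out as 0.1861 of Florida PSU 2025, all 1.1413 of Alabama PSU 2073, and 0.6726 of Louisiana PSU 2071, matching the description above.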

4.4.4 Select Sample 

For the example of table 4-5, a 2-replicate unit was selected from
each of the 5 double-replicate zones, and a 1-replicate unit was selected
from each of the 5 single-replicate zones. Examining the way the zones were
formed in section 4.4.3, the first unit, Florida PSU 2025, was assured of
being selected in double-replicate zone 1 and had a small probability of
being again selected in zone 2. Its probability of being selected in zone
2 was 0.0931 (0.1861/2).

Similarly, North Carolina PSU 2119 was included in both double-replicate
zone 5 and single-replicate zone 1. Its probability of being selected
in zone 5 was 0.3302 [(0.7356 - 0.0752)/2] and in zone 1 was 0.0752 (0.0752/1).
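
The within-zone selection probabilities follow directly from each unit's share of its zone. The sketch below illustrates this with the shares implied by table 4-5; the function names are hypothetical, and the random draw shown is a plain probability-proportional-to-share draw assumed for illustration, not the selection algorithm the report describes in section 6.

```python
import random

def selection_probabilities(zone):
    """Probability of each unit in a zone: its share divided by the zone size."""
    size = sum(share for _, share in zone)
    return {unit: share / size for unit, share in zone}

def draw_unit(zone, rng):
    """Draw one unit from a zone with probability proportional to its share."""
    units = [unit for unit, _ in zone]
    weights = [share for _, share in zone]
    return rng.choices(units, weights=weights, k=1)[0]

# Shares of double-replicate zones 2 and 5 in the region 2, SDOC2 example.
zone2 = [("Florida 2025", 0.1861), ("Alabama 2073", 1.1413),
         ("Louisiana 2071", 0.6726)]
zone5 = [("Kentucky 2111", 0.4402), ("Virginia 2760", 0.3989),
         ("Virginia 3129", 0.5005), ("North Carolina 2119", 0.6604)]

p2 = selection_probabilities(zone2)
p5 = selection_probabilities(zone5)
picked = draw_unit(zone2, random.Random(1979))
```

Florida PSU 2025 has probability 0.1861/2 in zone 2, and North Carolina PSU 2119 has probability 0.6604/2 in zone 5.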

4.4.5 Assign Selected Units to Years

The entire five-sample allocation was selected and assigned
systematically to single samples. The samples were denoted by the numbers 1
through 5, with 1 through 4 signifying Years 11 through 14, respectively,
and 5 signifying the replacement and large sample option (called the
supplemental year).

After the single units were selected by zone, random permutations of
the integers 1 through 5 were repeatedly generated and used to assign the
units to years. The assignment is shown in table 4-6 for the example in
table 4-5. The first ten selected units are the double-replicate units and
the last five are the single-replicates. Florida PSU 2025 was selected
twice as were Tennessee PSU 2157 and Kentucky PSU 2111.

Three permutations of the integers from 1 to 5 were generated (i.e.,
1-5-3-4-2, 5-2-3-4-1, and 4-3-1-2-5) to assign the selected units to years.
Thus, Florida PSU 2025 was assigned to Years 1 and 5, Alabama PSU 2073 was
assigned to Year 3, Louisiana PSU 2071 was assigned to Year 4, and Tennessee
PSU 2157 was assigned to Year 2. This procedure was continued to assign
the remaining selected units to years.
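
The assignment by successive permutations can be sketched as follows. This is an illustration only; the function name assign_years is hypothetical, and the option to pass the permutations explicitly is added here so that the example of table 4-6 can be reproduced.

```python
import random

def assign_years(selected, permutations=None, rng=None, samples=5):
    """Assign selected units, in order, to samples 1..5.

    Successive random permutations of 1..samples are laid end to end; the
    k-th selected unit receives the k-th number of that sequence.
    """
    rng = rng or random.Random()
    years, block = [], 0
    while len(years) < len(selected):
        if permutations is not None:
            perm = list(permutations[block])
        else:
            perm = rng.sample(range(1, samples + 1), samples)
        years.extend(perm)
        block += 1
    return list(zip(selected, years))

# Selected units in selection order (double-replicate units first).
SELECTED = [
    "Florida 2025", "Florida 2025", "Alabama 2073", "Louisiana 2071",
    "Tennessee 2157", "Tennessee 2157", "Tennessee 2037", "Kentucky 2111",
    "Kentucky 2111", "North Carolina 2119", "Georgia 2121", "Georgia 2089",
    "Florida 2103", "Florida 2031", "Florida 2057",
]
PERMS = [(1, 5, 3, 4, 2), (5, 2, 3, 4, 1), (4, 3, 1, 2, 5)]
assignment = assign_years(SELECTED, permutations=PERMS)
```

Because each permutation covers every sample exactly once, each block of five consecutive selections is spread over all five samples.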


When a unit was selected for more than one year, control was not
exercised to insure balance between years. The assignment was said to be
balanced if all units were assigned to different years when at most five
units were selected, if at most 2 units were assigned per year when between
6 and 10 units were selected, and if at most 3 units were assigned per year
when between 11 and 15 units were selected. After the sample was selected,
the multiply selected units were enumerated and reassigned to
years to make the assignment balanced.
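
The balance rule can be written as a single condition: with n units spread over five samples, no sample may receive more than n/5 units, rounded up. The sketch below is an illustration; the function name is_balanced is hypothetical, and extending the rule beyond 15 units in the same way is an assumption.

```python
from collections import Counter
from math import ceil

def is_balanced(years, samples=5):
    """True if no sample holds more than ceil(n / samples) of the n units."""
    if not years:
        return True
    limit = ceil(len(years) / samples)   # 1 for n<=5, 2 for 6-10, 3 for 11-15
    return max(Counter(years).values()) <= limit

# Sample years from the table 4-6 example, in selection order.
example_years = [1, 5, 3, 4, 2, 5, 2, 3, 4, 1, 4, 3, 1, 2, 5]
balanced = is_balanced(example_years)
```

The example assignment is balanced, since each of the five samples receives exactly three of the fifteen selections.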



Table 4-6. Assignment of selected units to years

                     Primary                       Replicate
State             sampling unit    Sample year      status

Florida               2025              1              2
Florida               2025              5              2
Alabama               2073              3              2
Louisiana             2071              4              2
Tennessee             2157              2              2
Tennessee             2157              5              2
Tennessee             2037              2              2
Kentucky              2111              3              2
Kentucky              2111              4              2
North Carolina        2119              1              2
Georgia               2121              4              1
Georgia               2089              3              1
Florida               2103              1              1
Florida               2031              2              1
Florida               2057              5              1












5. OPTIONS FOR LARGE AND SMALL ANNUAL SAMPLES





When the decision was made in Year 07 for economic reasons to reduce
the sample to 75 travel points and 550 schools per age class, it was hoped
that full funding would be restored in the future and samples of 1,000
schools per age class could be selected. Prior to Year 07, samples
consisted of 216 replicates or about 116 travel points and 1,000 schools per
age class, which yielded 2,592 responses per package since each session
yielded about 12 responses (12 x 216 = 2,592). In Year 07, and all
succeeding years, the number of replicates was reduced to 162, thereby
decreasing the number of travel points to 75 and the number of
schools to 500 per age class. A total of 2,592 responses per package were
still obtained since the number of respondents per session was increased to
16 (16 x 162 = 2,592). The Year 11 4-year primary sample was designed to
accommodate either 75 PSUs and 550 schools per age class or 100 PSUs and
1,000 schools per age class. This objective was met by defining a fifth
primary sample which could be used for the dual purposes of (1) augmenting
the 75 PSU primary sample to the 100 PSU level or (2) providing replacement
PSUs for those which refuse. The supplemental sample is used to identify
replacement PSUs for those refusing by locating the PSU selected from the
same zone as the refusing one. The remainder of this chapter explains how
the supplemental sample is used to augment the sample to the 100 PSU level.

5.1 Primary Sample

In order to increase an annual four-year primary sample to 100 PSUs,
the supplemental sample is partitioned into 4 subsamples. The subsamples
are then randomly assigned to the 4 years.



Subsamples are balanced with respect to number of PSUs, number of
replicates, and regional and SDOC allocations. One such partitioning is
shown in table 5-1. Here the total of 162 replicates is partitioned into 3
sets of 41 and 1 set of 39. The total number of 83 PSUs is partitioned
into 3 sets of 21 and 1 set of 20. The sample allocation in terms of 1-,
2-, and 3-replicate units is also depicted in the table for each subsample.

Thus, if subsample 2 were selected to augment a particular annual
sample, the total number of PSUs would be 104 (83 + 21), the total number
of replicates would be 203 (162 + 41), and the total numbers of 1-, 2-, and
3-replicate units would be 10 (8 + 2), 89 (71 + 18), and 5 (4 + 1),
respectively.
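
The augmentation arithmetic in this section can be checked with a short
sketch. The unit counts are those of the "Total" row of table 5-1 as read
here; the function and variable names are illustrative only and are not
part of the report.

```python
# Counts of 1-, 2-, and 3-replicate PSUs, from the "Total" row of table 5-1.
ANNUAL = (8, 71, 4)            # an 83-PSU, 162-replicate annual sample
SUBSAMPLES = {                 # the four supplemental subsamples
    1: (2, 18, 1),
    2: (2, 18, 1),
    3: (2, 18, 1),
    4: (2, 17, 1),
}


def n_psus(units):
    """Number of PSUs: one per unit, whatever its replicate status."""
    return sum(units)


def n_replicates(units):
    """Number of replicates: a k-replicate PSU contributes k."""
    return sum(k * count for k, count in zip((1, 2, 3), units))


def augment(annual, subsample):
    """Add a subsample's unit counts to an annual sample's unit counts."""
    return tuple(a + s for a, s in zip(annual, subsample))


augmented = augment(ANNUAL, SUBSAMPLES[2])
print(augmented, n_psus(augmented), n_replicates(augmented))
# prints: (10, 89, 5) 104 203
```

Augmenting with subsample 4 instead would give 103 PSUs and 201
replicates, since that subsample holds 20 PSUs and 39 replicates.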

5.2 Secondary Sample 


If the 75 PSU option is elected, the total number of schools selected
per age class would be about 550. For the 100 PSU option, approximately
1,000 schools per age class would be selected. The total number of schools
for each age would be apportioned among the PSUs in proportion to the
replicate status of the PSU. The school sample would be selected adhering
as closely as possible to this allocation.
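
A minimal sketch of this apportionment follows, assuming that "in
proportion to the replicate status" means a k-replicate PSU receives k
equal shares. The counts of 1-, 2-, and 3-replicate PSUs (8, 71, 4) are
those quoted in section 5.1; the function name is hypothetical, and the
rounding of fractional allocations to whole schools is not described in the
report and is not attempted here.

```python
def schools_per_psu(total_schools, units):
    """Expected schools for a 1-, 2-, and 3-replicate PSU.

    `units` holds the numbers of 1-, 2-, and 3-replicate PSUs.
    """
    total_replicates = sum(k * n for k, n in zip((1, 2, 3), units))
    per_replicate = total_schools / total_replicates
    return {k: k * per_replicate for k in (1, 2, 3)}


units_75 = (8, 71, 4)                    # 83 PSUs, 162 replicates
alloc = schools_per_psu(550, units_75)
for k, schools in alloc.items():
    print(f"{k}-replicate PSU: {schools:.2f} schools")
```

Summing the allocation over all 83 PSUs returns the 550 schools that
were apportioned.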

As noted earlier, PSUs were formed so as to contain a minimum of
1,500 age class eligibles. This number of eligibles can accommodate the
selection of either 550 or 1,000 schools per age class. Table 5-2 compares
the numbers of schools selected per replicate and per PSU under each option.






Table 5-1.  Partitioning supplemental samples into 4 subsamples

                       Subsample 1        Subsample 2        Subsample 3        Subsample 4           Total
Region  SDOC        1-rep 2-rep 3-rep  1-rep 2-rep 3-rep  1-rep 2-rep 3-rep  1-rep 2-rep 3-rep  1-rep 2-rep 3-rep

[Region-by-SDOC rows and regional subtotals are not legible in the source.]

Total                   2    18     1      2    18     1      2    18     1      2    17     1      8    71     4
Total PSUs                   21                 21                 21                 20                 83
Total Replicates             41                 41                 41                 39                162







Table 5-2.  Numbers of schools per replicate and per PSU for each option

                               550 schools and       1,000 schools and
                               75 travel points      100 travel points

Number of replicates           162                   162 + 162/4 = 202.5
Number of schools              550                   1,000
Number of schools/replicate    3.40                  4.94

Number of PSUs                 83                    83 + 83/4 = 103.75
Number of schools              550                   1,000
Number of schools/PSU          6.63                  9.64
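
The ratios in table 5-2 follow directly from the counts in the first two
rows of each panel. The sketch below recomputes them; the dictionary layout
and names are illustrative only.

```python
# Counts from table 5-2; the 100-travel-point option adds one quarter of the
# supplemental sample (162/4 replicates, 83/4 PSUs on average).
OPTIONS = {
    "75 travel points": {
        "replicates": 162, "psus": 83, "schools": 550,
    },
    "100 travel points": {
        "replicates": 162 + 162 / 4, "psus": 83 + 83 / 4, "schools": 1000,
    },
}


def ratios(option):
    """Return (schools per replicate, schools per PSU) for one option."""
    return (option["schools"] / option["replicates"],
            option["schools"] / option["psus"])


for name, option in OPTIONS.items():
    per_rep, per_psu = ratios(option)
    print(f"{name}: {per_rep:.2f} schools/replicate, "
          f"{per_psu:.2f} schools/PSU")
```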






6. SELECTION TECHNIQUES



A stratified probability proportional-to-size selection technique was
employed to select the sample. The size measure was the variable OVERSIZE
as defined in section 3.2. The algorithm for sample selection is described
by Chromy (1979).
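
Chromy's sequential algorithm is not reproduced here. As a rough
illustration of probability proportional-to-size (PPS) selection within
strata in general, the sketch below uses the simpler, well-known systematic
PPS method instead; it is not the procedure used for the Year 11 sample.
The function names, stratum labels, and size measures are all hypothetical.

```python
import random
from itertools import accumulate


def systematic_pps(sizes, n, rng):
    """Draw n selections with probability proportional to size.

    Returns the index of the unit hit by each of n equally spaced
    selection points; a unit larger than the sampling interval can be
    hit more than once.
    """
    total = sum(sizes)
    interval = total / n
    start = rng.random() * interval          # random start in [0, interval)
    cumulative = list(accumulate(sizes))
    hits = []
    unit = 0
    for k in range(n):
        point = start + k * interval
        # Advance to the unit whose cumulative-size range contains the point.
        while unit < len(cumulative) - 1 and cumulative[unit] <= point:
            unit += 1
        hits.append(unit)
    return hits


def stratified_pps(strata, allocation, seed=0):
    """Apply systematic PPS independently within each stratum."""
    rng = random.Random(seed)
    return {name: systematic_pps(sizes, allocation[name], rng)
            for name, sizes in strata.items()}


# Hypothetical size measures for two strata (not data from the report).
strata = {"A": [120, 300, 80, 500], "B": [60, 60, 240]}
print(stratified_pps(strata, {"A": 2, "B": 1}))
```

In stratum "A" the sampling interval is 500, so the last unit (size 500)
is selected with certainty, while the other three units share the first
selection in proportion to their sizes.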




7. SPECIAL POPULATIONS 



When the coordinated four-year primary sample was selected in Year 11,
there was concern that over the period of four years the sample might need
to be directed toward certain minority populations. Requests had already
been received to report data separately for Hispanics, and a special report
was prepared by NAEP for this purpose by aggregating data across exercises
and reporting mean values. At one time the option to include Miami in the
four-year sample with certainty was considered to ensure a sample size
adequate to report Hispanic data separately by exercise. This idea was
rejected because a more general solution to the minority population problem
was desired. At the time, the interest was in Hispanics, but it was felt
that over the period of four years interest might shift to Asians or
Indians. Enough Black responses are obtained to report exercise-level data
for the Black subpopulation.

To solve the minority problem, it was decided to include on the frame
county-level counts of numbers of Asians, Blacks, Hispanics, and Indians.
These counts were obtained from 1970 Census data as described in section
3.2 of this report (see ASIANPOP, ASIZE, BLACKPOP, BSIZE, HISPOP, HSIZE,
INDIAN, and ISIZE). The sample was selected after the frame was stratified
by percent minority, which is a combination of all the races listed above.
The stratification of the frame is explained in section 4. The purpose of
the stratification was to balance the allocation of minority population to
annual samples.

Table 7-1 was prepared to display the weighted estimates of each type
of minority population included in each annual sample and the supplemental
sample. Estimates are included for average numbers of 9-, 13-, and 17-
year-olds, Asian population, Hispanic population, Indian population, Black
population, and adjusted numbers of 9-, 13-, and 17-year-olds. Estimates
were computed by determining the weights for each PSU and then summing the
cross product of the weight and the estimate across all PSUs. Adjusted 9-,
13-, and 17-year-olds were also included to verify each sample. The
adjusted estimate was computed by counting the oversampled population in
each primary sampling unit twice. The adjusted estimate should be constant
from sample to sample. The arithmetic mean, standard error, relative
standard error, maximum, and minimum estimate by type of population are
also included in the table. The relative standard error is the ratio of the
standard error to the arithmetic mean. The frame (or true) value was
obtained by summing the estimate over the entire frame.
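
The computations described for table 7-1 can be sketched as follows. The
five "Average 9,13,17" estimates are as read from table 7-1; the function
names and the small weighted-total example are hypothetical. Treating the
tabulated standard error as the sample standard deviation of the five
estimates (divisor n - 1) is an assumption, though it reproduces the
tabulated values for this column.

```python
from statistics import mean, stdev


def weighted_total(weights, values):
    """Weighted estimate: sum over PSUs of weight times PSU-level value."""
    return sum(w * v for w, v in zip(weights, values))


def adjusted_value(population, oversampled):
    """Adjusted count: the oversampled part of the population counts twice."""
    return population + oversampled


def summarize(estimates):
    """Mean, standard error, and relative standard error across samples."""
    m = mean(estimates)
    s = stdev(estimates)          # sample standard deviation (divisor n - 1)
    return m, s, s / m


# Hypothetical two-PSU example of a weighted total (not report data).
print(weighted_total([10.0, 20.0], [100, 50]))

# "Average 9,13,17" estimates for Years 11-14 and the supplemental sample,
# as read from table 7-1.
average_9_13_17 = [3_881_371, 3_875_897, 3_869_251, 3_858_330, 3_862_196]
m, s, rse = summarize(average_9_13_17)
print(f"mean={m:,.0f}  s={s:,.0f}  RSE={rse:.2%}")
```

For this column the sketch gives a mean of 3,869,409, a standard error
of 9,487, and a relative standard error of 0.25 percent.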

Tables 7-2 through 7-7 list weighted estimates of the minority
populations by region and SDOC category for each annual sample, the
supplemental sample, and the frame. The adjusted numbers of 9-, 13-, and
17-year-olds for each region and SDOC category are constant from sample to
sample and for the frame, as they should be. It was noted that the
supplemental sample region 4 and SDOC 4 category contained unusually high
Asian and Indian populations. Further investigation revealed that the large
numbers arose because the Alaska Balance was selected from that region by
SDOC category.




Table 7-1.  Weighted estimates of minority populations from 4-year primary sample

                   Average       Asian    Hispanic      Indian       Black    Adjusted
Year               9,13,17  population  population  population  population     9,13,17

11               3,881,371    [illeg.]    [illeg.]      23,618     349,341   4,315,927
12               3,875,897      65,391     239,268       8,363     415,444   4,315,928
13               3,869,251      21,181    [illeg.]      19,345     439,770   4,315,928
14               3,858,330    [illeg.]     157,046      23,900     421,680   4,315,928
Supplemental     3,862,196      34,111     184,294      15,980     388,739   4,315,927

Mean             3,869,409      31,004     200,230      18,241     402,995   4,315,928
s                    9,487      20,450      31,005     6,419.5      35,137    [illeg.]
RSE                   .25%         66%    [illeg.]         35%    [illeg.]          0%
Frame Value      3,868,400      28,040     203,983      16,791     413,643   4,315,926
Maximum          3,881,371      65,391     239,268      23,900     439,770    [illeg.]
Minimum          3,858,330    [illeg.]     157,046       8,363     349,341    [illeg.]








Table 7-2.  Weighted estimates of minority populations by
            region and SDOC for Year 11

Region  SDOC     Average       Asian    Hispanic      Indian       Black    Adjusted
                 9,13,17  population  population  population  population     9,13,17

   1     1      [illeg.]    [illeg.]     24322.2       274.9       61664     337,519
   1     2       231,294    [illeg.]      1973.0       216.9    [illeg.]     231,294
   1     3       321,465    [illeg.]      5930.6       446.9       15604     321,465
   1     6       125,811    [illeg.]       750.4       260.7        2907     125,811
   1     7        11,689    [illeg.]        57.5         4.8        3050      23,377
   2     1      [illeg.]    [illeg.]    [illeg.]       183.4       22520     171,171
   2     2        90,010    [illeg.]      1977.0        76.1        8699      90,010
   2     3       272,331    [illeg.]    [illeg.]    [illeg.]       34945     272,331
   2     4       314,151    [illeg.]    [illeg.]       259.4       33661     314,151
   2     5        62,495    [illeg.]      1330.3        25.5       15735     124,989
   3     1      [illeg.]    [illeg.]    [illeg.]       560.7       53800     382,934
   3     2       186,151    [illeg.]    [illeg.]       124.3        4583     186,151
   3     3       268,679    [illeg.]    [illeg.]       262.0       13084     268,679
   3     4       188,897    [illeg.]      5439.8       135.1         800     188,897
   3     5       105,705    [illeg.]    [illeg.]       603.9         431     211,410
   4     1       449,542     6770.77    [illeg.]      4191.1       39409     496,084
   4     2        78,696      174.62      7147.9       181.4        2374      78,696
   4     3       268,835     2582.30     24972.3       986.4       13090     268,835
   4     4       138,779      455.26     17510.0     13448.8        3837     138,779
   4     5        41,672       47.75      1240.1    [illeg.]         580      83,343




Table 7-3.  Weighted estimates of minority populations by
            region and SDOC for Year 12

Region  SDOC     Average       Asian    Hispanic      Indian       Black    Adjusted
                 9,13,17  population  population  population  population     9,13,17

   1     1       258,233      1355.3     13187.5      404.15       53580     337,519
   1     2       231,294       821.7      3164.9      105.53        4931     231,294
   1     3       321,465       726.5      7841.5      175.99       11600     321,465
   1     6       125,811       151.4       700.6       67.30         766     125,811
   1     7        11,689         5.5       109.4        8.30          51      23,377
   2     1       133,150       347.5      2591.5    [illeg.]       34666     171,171
   2     2        90,011        64.8       868.7       63.70        1913      90,011
   2     3       272,331       813.0      4126.6      254.75       45728     272,331
   2     4       314,151       269.4    [illeg.]      209.16      107567     314,151
   2     5        62,495        51.4       342.1       24.63    [illeg.]     124,989
   3     1       326,457    [illeg.]      7767.4      656.15       49925     382,934
   3     2       186,151    [illeg.]      5404.4      198.43        6041     186,151
   3     3       268,679       570.1      4385.9      586.11       10530     268,679
   3     4       188,897       213.8      2141.6      354.68        7833     188,897
   3     5       105,705        50.8      1623.2      141.51         146     211,410
   4     1       451,396     35387.5     77803.7     1414.03       33498     496,084
   4     2        78,696       734.3      4154.4      149.50        1329      78,696
   4     3       268,835      2374.8     67873.3     1359.99       11048     268,835
   4     4       138,779    [illeg.]     23704.7     1747.92        1607     138,779
   4     5        41,672       227.3      8765.5      346.25        1473      83,343






Table 7-4.  Weighted estimates of minority populations by
            region and SDOC for Year 13

Region  SDOC     Average       Black    Adjusted
                 9,13,17  population     9,13,17

   1     1       260,795       67332     337,519
   1     2       231,294       12938     231,294
   1     3       321,465       16793     321,465
   1     6       125,811        7593     125,811
   1     7        11,689          42      23,377
   2     1       149,818       28999     171,171
   2     2        90,011        8139      90,011
   2     3       272,331       54857     272,331
   2     4       314,151       72166     314,151
   2     5        62,495       25275     124,989
   3     1       316,294       66505     382,934
   3     2       186,151        2346     186,151
   3     3       268,679       17399     268,679
   3     4       188,897         602     188,897
   3     5       105,705        2498     211,410
   4     1       435,683       37702     496,084
   4     2        78,696        2121      78,696
   4     3       268,835        7656     268,835
   4     4       138,779        8707     138,779
   4     5        41,672         101      83,343

[The Asian, Hispanic, and Indian population columns are not legible in the source.]




Table 7-5.  Weighted estimates of minority populations by
            region and SDOC for Year 14

Region  SDOC     Average       Asian    Hispanic      Indian       Black    Adjusted
                 9,13,17  population  population  population  population     9,13,17

   1     1      [illeg.]    [illeg.]    [illeg.]    [illeg.]    [illeg.]     337,519
   1     2       231,294    [illeg.]    [illeg.]    [illeg.]    [illeg.]     231,294
   1     3       321,465    [illeg.]    [illeg.]    [illeg.]    [illeg.]     321,465
   1     6       125,811    [illeg.]    [illeg.]    [illeg.]    [illeg.]     125,811
   1     7        11,689    [illeg.]    [illeg.]    [illeg.]    [illeg.]      23,377
   2     1      [illeg.]    [illeg.]    [illeg.]    [illeg.]    [illeg.]     171,171
   2     2        90,011    [illeg.]    [illeg.]       62.99    [illeg.]      90,011
   2     3       272,331    [illeg.]    [illeg.]    [illeg.]    [illeg.]     272,331
   2     4       314,151      185.48      2643.3     1342.06       89035     314,151
   2     5        62,495       25.01       787.7    [illeg.]       10008     124,989
   3     1      [illeg.]     1168.37    [illeg.]      489.80    [illeg.]     382,934
   3     2       186,151      311.81    [illeg.]      210.79    [illeg.]     186,151
   3     3       268,679      335.04    [illeg.]      284.83       13244     268,679
   3     4       188,897      368.39      1436.4      440.14        1227     188,897
   3     5       105,705       53.97       783.9     4659.80         104     211,410
   4     1       425,316     5500.28     65758.4     1878.37       38993     496,084
   4     2        78,696     1068.30      5970.1      212.05        3956      78,696
   4     3       268,835     4334.46     24836.8     2111.19       11360     268,835
   4     4       138,779      495.06      3064.5     9689.50        2504     138,779
   4     5        41,672       29.27      1955.6     1458.95        2592      83,343



Table 7-6.  Weighted estimates of minority populations by
            region and SDOC for supplemental year

Region  SDOC     Average       Asian    Hispanic      Indian       Black    Adjusted
                 9,13,17  population  population  population  population     9,13,17

   1     1       270,016      1032.5     19009.8      355.56       51597     337,519
   1     2       231,294       559.8      2448.8      143.39        9828     231,294
   1     3       321,465       584.1      4452.4      171.47       15822     321,465
   1     6       125,811    [illeg.]      1153.0       84.68        1087     125,811
   1     7        11,689         5.6        57.5        4.77        3050      23,377
   2     1       138,835       273.5     17592.0      113.03       29431     171,171
   2     2        84,503       412.8      1359.8       87.53       13263      90,011
   2     3       272,331       497.6      2857.1      145.50       77907     272,331
   2     4       314,151       349.9      3842.5      165.40       27350     314,151
   2     5        62,495        22.5       877.0       73.61       13415     124,989
   3     1       324,674      1294.4      8623.5      646.23       55777     382,934
   3     2       186,151       351.3      2198.7      119.84        3897     186,151
   3     3       268,679    [illeg.]      6208.6    [illeg.]    [illeg.]     268,679
   3     4       188,897       131.0      1587.2    [illeg.]         770     188,897
   3     5       105,705        73.9       698.9      635.03         940     211,410
   4     1       427,519      9788.3     64153.0    [illeg.]       48193     496,084
   4     2        78,696      1687.4      6300.7     1897.30        3472      78,696
   4     3       268,835      2579.7     18515.1     2616.87       17467     268,835
   4     4       138,779     14019.9      7547.3     6652.52        4093     138,779
   4     5        41,672        37.9     14810.9    [illeg.]         495      83,343







Table 7-7.  Weighted estimates of minority populations by
            region and SDOC for frame

Region  SDOC     Average       Asian    Hispanic      Indian       Black    Adjusted
                 9,13,17  population  population  population  population     9,13,17

   1     1       264,007      1835.4     21168.2      337.16       58845     337,519
   1     2       231,294       761.0      3457.4      177.57       11329     231,294
   1     3       321,465       722.6      6226.2      250.84       14883     321,465
   1     6       125,811       166.0       877.0    [illeg.]    [illeg.]     125,811
   1     7        11,689         5.5        74.0        4.34         823      23,377
   2     1       136,765       315.1      8146.7      113.91       32985     171,171
   2     2        89,031    [illeg.]      1590.4       83.49       10406      90,011
   2     3       272,331    [illeg.]    [illeg.]      349.77       56228     272,331
   2     4       314,151       236.7      3180.8     1031.12       65559     314,151
   2     5        62,495        31.1       782.8      101.91       17724     124,989
   3     1       323,019    [illeg.]      8776.9      585.34       56826     382,934
   3     2       186,151    [illeg.]      3298.3    [illeg.]        5700     186,151
   3     3       268,679       459.6      4825.7      530.27       13872     268,679
   3     4       188,897       174.3      2573.5      823.70        2735     188,897
   3     5       105,705        58.3       885.5      817.85         476     211,410
   4     1       438,930     16394.0     67886.1     2361.36       39944     496,084
   4     2        78,696       776.2      6626.1    [illeg.]        3188      78,696
   4     3       268,835      2832.0     38010.0     1793.20       11303     268,835
   4     4       138,779       981.4     15593.4     5561.80        6208     138,779
   4     5        41,671        88.7      6560.3      937.07        1586      83,343






REFERENCES 



Chromy, James R. (1979), "Sequential Sample Selection Methods," Proceedings
of the Section on Survey Research Methods, American Statistical
Association, pp. 401-406.




